# Intro to Python - Darden 

Today we will take a crash course through the world of Python. Python is a free, open-source, interpreted programming language used for a huge variety of purposes. Let's start with the basics.

# Basic Data Types

These are the basic data types that make up your data in Python. Python interprets each one differently, and each has its own properties.

## Variables
A variable is a name that refers to a value stored in memory. You create one by assigning a value to a name with the = sign. Every value has a data type, and Python works out the type for you. You can give your variable (almost) any name and store (almost) anything in it.
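
Here is a small sketch of what that looks like in practice. The name favorite is made up just for this illustration. Notice that the same name can be pointed at a new value later, even one of a different type.

```python
# "favorite" is a made-up variable name, just for illustration
favorite = 7
print(favorite, type(favorite))      # 7 <class 'int'>

# The same name can be pointed at a new value, even of a different type
favorite = "seven"
print(favorite, type(favorite))      # seven <class 'str'>
```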

## Strings
Strings are interpreted as text. You write a string by putting quotes around it.
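
As a quick illustration (the variable names here are made up), the quotes are what make something a string. The text "1234" and the number 1234 are different things to Python until you convert one into the other.

```python
text_number = "1234"      # a string, because of the quotes
real_number = 1234        # an integer, no quotes

print(text_number == real_number)        # False: text is not a number
print(int(text_number) == real_number)   # True: int() converts the text to a number
print(len(text_number))                  # 4: a string knows how many characters it has
print('single' == "single")              # True: single and double quotes both work
```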


## Comments
Anything on a line after the # is ignored by the interpreter.
Use comments to turn stuff on and off, document your code, etc.

In [None]:
#Strings and variables   (this is a comment)

variable1 = "apple"
variable2 = "1234"
variable3 = "This is a string"
Variable3 = "This is also a string"     #variable names are case sensitive!


## Functions

Functions are re-usable pieces of code. Some come with your base installation of python (like print() below). Some are imported from other libraries. Some you write yourself.

A function should do ONE thing. If you need to do an operation 10,000 times, functions come in handy!
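
Here is a minimal sketch of a function you write yourself. The name fahrenheit_to_celsius is my own invention for this example. It does ONE thing: converts a temperature.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9

# Once defined, the function can be reused as many times as you like
boiling = fahrenheit_to_celsius(212)
freezing = fahrenheit_to_celsius(32)

print(boiling)     # 100.0
print(freezing)    # 0.0
```

The def keyword starts the definition, and return hands the result back to whoever called the function.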

In [None]:
#print() is a function you will probably use a lot. print()
#prints the output to the console. The console is usually a 
#command line-esque box, like you see below when running this cell

print(variable1)
print(variable2)
print(variable3)
print(Variable3)

## Ints and Floats
These are both numeric data types but are not the same!

In [None]:
num1 = 3        #int
num2 = 3.0      #float

print("num1 is a: ",type(num1))
print("num2 is a: ", type(num2))


In [None]:
#you can do math with numbers and variables, and you can join strings with +

num3 = 3 + 1
num4 = num1 + 10
num5 = "Hello"
num6 = "World"
num7 = num5 + num6

print(num3)
print(num4)
print(num7)
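
To see why ints and floats are not the same, here is a short sketch (the variable names are made up). The main thing to watch is division and mixing the two types.

```python
half = 7 / 2          # regular division always gives a float
whole = 7 // 2        # floor division drops the remainder
mixed = 3 + 1.0       # mixing an int and a float gives a float

print(half)           # 3.5
print(whole)          # 3
print(mixed)          # 4.0
print(3 == 3.0)       # True: same value, different type
print(int(3.9))       # 3: int() chops off the decimals, it does not round
```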

## Error Messages
Get used to them. 

Error messages attempt to tell you what broke in your code, and where. Sometimes they are helpful, sometimes not. Sometimes an error in your code does not manifest itself until later.

Error messages are meant to be read from the bottom up. In the error below, the last line shows the type is a 'TypeError' and tells us we cannot add an 'int' and a 'str'. The lines above it (the traceback) point to the line in your code that broke.

In [None]:
#You can't add an integer and a string
print(3 + "Hello World")
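
If you expect an error might happen, you can catch it instead of letting your program stop. This is a minimal sketch using try/except on the same mistake as the cell above. The variable names are made up.

```python
try:
    result = 3 + "Hello World"      # this line raises a TypeError
except TypeError as error:
    result = None                   # we decide what to do instead of crashing
    print("Caught a TypeError:", error)

# One way to fix the problem: turn the number into a string first
fixed = str(3) + " Hello World"
print(fixed)                        # 3 Hello World
```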

## Lists and Dictionaries
These data types can hold multiple items in them. They function a bit differently though and are used for different purposes.

### List
A list is an ordered, mutable (changeable) collection of objects.
In python, a list is written with square brackets [ ]. Items in a list are accessed through indexing.
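
Here is a quick sketch of what "mutable" means for a list. The list fruits is made up for this example.

```python
fruits = ["apple", "banana", "cherry"]

fruits.append("mango")      # add an item to the end
fruits[0] = "apricot"       # replace the first item

print(fruits)               # ['apricot', 'banana', 'cherry', 'mango']
print(len(fruits))          # 4
print(fruits[1:3])          # ['banana', 'cherry'] (a slice: items 1 and 2)
```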

### Dictionary
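A dictionary is a mutable collection of key-value pairs. In Python, a dictionary is written with curly braces {}. Each item is stored as a key: value pair, with the key first and the value second. A value is accessed through its key, not through a numeric position. (Since Python 3.7 a dictionary remembers the order in which items were added, but you still look items up by key.)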
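
Dictionaries are changeable too. This is a small sketch with a made-up dictionary called prices.

```python
prices = {"coffee": 3.50, "tea": 2.75}

prices["juice"] = 4.00       # a new key adds a new item
prices["tea"] = 3.00         # an existing key updates its value

print("coffee" in prices)    # True: "in" checks the keys
print(list(prices))          # ['coffee', 'tea', 'juice']

del prices["coffee"]         # remove an item by its key
print(prices)                # {'tea': 3.0, 'juice': 4.0}
```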

In [None]:
#this is a list
cities_list = ['New York', 'London', 'Bangkok', 'Tokyo', 'Mumbai']

#this is a dictionary
cities_dict = {'United States' : 'Washington',
               'France' : 'Paris',
               'China' : 'Beijing',
               'India' : 'New Delhi',
               'Australia' : 'Canberra',
               'Iran' : 'Tehran'}

In [None]:
#Indexing: Remember that lists are ORDERED. This means every item
#in the list has a location, starting at 0.

print(cities_list)

#print individual items from the list
print(cities_list[0], cities_list[1])

#indexing also works in reverse
print(cities_list[-1])

#assign variable names to list items (remember, you can assign a variable name to anything)
important_city = cities_list[3]
print(important_city)

In [None]:
#Dictionaries are accessed through the key name, not by position,
#so you cannot use numeric indexing like you can with a list.

#print the whole thing
print(cities_dict)

#this is the syntax to access an individual value, via the key
print(cities_dict['United States'])
print(cities_dict['Australia'])

#remember, you can create a new variable for anything
india_capital = cities_dict['India']
#f strings. Make your strings look nice!
print(f"The capital of India is {india_capital}")

In [None]:
#You cannot access a dictionary item using the value. This raises a KeyError
print(cities_dict['Paris'])
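
If you are not sure a key exists, there are gentler ways to ask than square brackets. This sketch uses a small made-up dictionary called capitals.

```python
capitals = {"France": "Paris", "India": "New Delhi"}

# "in" checks whether something is a KEY (values do not count)
print("France" in capitals)      # True
print("Paris" in capitals)       # False

# .get() returns None (or a default you choose) instead of raising a KeyError
missing = capitals.get("Paris")
fallback = capitals.get("Paris", "not a key")

print(missing)                   # None
print(fallback)                  # not a key
```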

## Loops
Loops are another fundamental concept in programming. Basically, loops allow you to do an operation multiple times.
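
Here is about the smallest loop I can think of, as a sketch. It uses the built-in range() function to repeat an operation a fixed number of times. The names total and step are made up.

```python
total = 0

for step in range(5):        # step takes the values 0, 1, 2, 3, 4
    total = total + step     # this line runs five times

print(total)                 # 10
```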

In [None]:
#spaces are (usually) NOT significant in python
#but indentation IS significant. Anything indented under a line
#is considered part of the code block above it

for city in cities_list:
    print(city)

In [None]:
numbers_list = [1,2,3,4,5]

for number in numbers_list:
    number = number + 1
    print(number)
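
One thing to notice about the loop above: it prints the new numbers, but numbers_list itself does not change. If you want to keep the results, one simple approach is to collect them in a new list. In this sketch, bumped is a made-up name.

```python
numbers_list = [1, 2, 3, 4, 5]
bumped = []                       # start with an empty list

for number in numbers_list:
    bumped.append(number + 1)     # save each result instead of only printing it

print(numbers_list)               # [1, 2, 3, 4, 5] (unchanged)
print(bumped)                     # [2, 3, 4, 5, 6]
```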

In [None]:
for key, value in cities_dict.items():
    print(f"{value} is the capital of {key}")

## Imports
The world of Python is wide. Many other people have written code that you can import and use. You can also write your own code, and then write something else later and import what you've written. Some extra functionality is built into Python (or comes with Anaconda). Otherwise you need to install the extra libraries before using them.
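
There are a few different ways to write an import. This sketch shows three of them using modules that come with base Python (math, statistics, and datetime), so nothing extra needs to be installed.

```python
import math                     # import the whole module
from statistics import mean     # import just one function from a module
import datetime as dt           # import a module and give it a shorter name

root = math.sqrt(16)
average = mean([1, 2, 3, 4])
year = dt.date(2020, 1, 31).year

print(root)       # 4.0
print(average)    # 2.5
print(year)       # 2020
```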

In [None]:
#the random module comes with base python and provides all kinds
# of functions to allow you to do things with random numbers, etc.
import random

In [None]:
#now I can use any feature of the random module. There are many.

#this generates a random integer between 0 and 20 (both 0 and 20 are possible)
random_number = random.randint(0,20)

print(random_number)     #this will change!
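
Sometimes you want "random" numbers that come out the same every time, for example so a classmate can reproduce your results. A minimal sketch: random.seed() resets the generator to a known starting point. The seed value 42 is arbitrary.

```python
import random

random.seed(42)                    # any number works as a seed
first = random.randint(0, 20)

random.seed(42)                    # same seed, so the same number comes out
second = random.randint(0, 20)

print(first == second)             # True
```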

## Conditionals & Booleans
Think of conditionals like a fork in the road. If one thing happens with the code, do this. If something else happens with the code, do that.

Basically, the conditional statement evaluates a condition as True or False, then makes a decision based on that value.

These are also called "if/elif/else" statements. The following example shows how it works.
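
Before the if/elif/else itself, here is a short sketch of the True/False values (booleans) that drive it. The names temperature, is_warm, and is_raining are made up.

```python
temperature = 72

is_warm = temperature > 65         # a comparison produces True or False
is_raining = False

print(is_warm)                     # True
print(type(is_warm))               # <class 'bool'>
print(temperature == 72)           # True: == tests for equality
print(temperature != 72)           # False: != tests for "not equal"

# Combine booleans with and, or, not
nice_day = is_warm and not is_raining
print(nice_day)                    # True
```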



In [None]:
random_number = random.randint(0, 100)
print(f"The random number is: {random_number}")

if random_number > 50:      #if this statement is true, do this
    print("The number is greater than 50")
elif random_number < 50:    #if not, and this statement is true, do this
    print("The number is less than 50")
else:                        #if none of the above are true, do this
    print("The number is exactly 50")



# REAL WORLD EXAMPLES

Now that you have seen a very quick tour of the basics, let's work with some real world data and see these things in action.

First, let's gather some data.

## API Data - [OpenWeatherMap](https://openweathermap.org/)
Let's gather some weather data from OpenWeatherMap. OpenWeatherMap is a weather service that exposes its data through an API.

### API 
[Application programming interface](https://www.howtogeek.com/343877/what-is-an-api/). Basically, an API allows computers (or applications) to talk to one another. OpenWeatherMap has made their data publicly available, so you can gather some data and then use it for your own purposes. 

Basically, you sign up to use the [OpenWeatherMap API](https://openweathermap.org/api) and they provide documentation of how to use the service, with code examples. I have followed the documentation examples and now can gather weather data from them.

In [None]:
!conda install --yes requests

In [None]:
#Now we need to import these libraries in order to use them

import requests
import json      #this library is already installed in base python

In [None]:
#Note: this is my API key and is only being used for example purposes. Please don't spam OpenWeatherMap using my API key.
#sign up for your own API key here: https://home.openweathermap.org/users/sign_up
app_id = '333de4e909a5ffe9bfa46f0f89cad105'                    

#Each city in the world has a unique id number. There are over 1,000,000 so I have given you a few to start with.
#You are welcome to look in the data dictionary for more.
city_id_dict = {'Charlottesville': 4752046,                                     
                'New York': 5128581,
                'Chicago': 4887398,
                'Paris': 6455259,
                'Cape Town': 3369157,
                'Beirut': 276781,
                'Dubai': 292223,
                'Shanghai': 1796236,
                'Moscow': 524901,
                'Addis Ababa': 344979,
                'Bangkok': 1609350,
                'Oslo': 6453366,
                'Sao Paulo': 3448439,
                'Bogota': 3688689,
                'Havana': 3553478}

## Making HTTP Requests
We will use the .get() function from the requests library to submit an HTTP request to OpenWeatherMap for today's weather. 

Basically, we submit a request via a URL. This goes to the server at OpenWeatherMap, and then data is sent back to us. 
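
To make the "request via a URL" idea concrete, here is a sketch that only builds the URL and prints it, without contacting the server. It uses urllib.parse from base Python. The value YOUR_API_KEY is a placeholder, not a real key, and 6453366 is the id for Oslo from the dictionary above.

```python
from urllib.parse import urlencode

base_url = "http://api.openweathermap.org/data/2.5/group"

# Everything after the "?" in a URL is a set of name=value parameters
params = {"APPID": "YOUR_API_KEY",     # placeholder, use your own key
          "id": 6453366,               # Oslo
          "units": "imperial"}

url = f"{base_url}?{urlencode(params)}"
print(url)
# http://api.openweathermap.org/data/2.5/group?APPID=YOUR_API_KEY&id=6453366&units=imperial

# The requests library can build the same query string for you:
# requests.get(base_url, params=params)
```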

In [None]:
city_name = 'Oslo'        #change the city name here
city_id_string = str(city_id_dict[city_name])

#Make a request to get today's weather. This is straight from the documentation.
#requests.get() makes the request to the API via the URL with the correct parameters
request = requests.get(f'http://api.openweathermap.org/data/2.5/group?APPID={app_id}&id={city_id_string}&units=imperial')

## JSON
JSON stands for "JavaScript Object Notation". Basically, JSON is a structured text data format, just like CSV. JSON is not specific to Python and can be read in many different languages.

In Python terms, JSON data is basically nested dictionaries and lists.
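
Here is a tiny sketch of that idea. The text in sample_text is made up and heavily trimmed; it is NOT real OpenWeatherMap output. It only imitates the nesting (a dictionary holding a list holding dictionaries) so you can see how the pieces fit together.

```python
import json

# Made-up JSON text for illustration only
sample_text = '{"list": [{"main": {"temp": 41.5}}]}'

sample_data = json.loads(sample_text)     # turn the text into Python objects

print(type(sample_data))                  # <class 'dict'>
print(type(sample_data["list"]))          # <class 'list'>

# Drill down one level at a time: dictionary key, list index, key, key
sample_temp = sample_data["list"][0]["main"]["temp"]
print(sample_temp)                        # 41.5
```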

In [None]:
#Now let's look at the data. See that it is nested dictionaries and lists

json_data = json.loads(request.text)
print(json_data)


In [None]:
#Take my word for it, this is how to drill down and
#get to the interesting information

temp_today = json_data['list'][0]['main']['temp']
print(f"Today in {city_name} the temperature is {temp_today}F")

In [None]:
#Now let's gather the temperature today for all cities in city_id_dict
#First I am going to extract the city_names from the dictionary

city_names = []

for city in city_id_dict.keys():
    city_names.append(city)
    
print(city_names)   #now I have a list of cities
    



In [None]:
#Now let's get today's temperature for each city

weather_data = {}

for city in city_names:
    
    city_id_string = str(city_id_dict[city])
    
    request = requests.get(f'http://api.openweathermap.org/data/2.5/group?APPID={app_id}&id={city_id_string}&units=imperial')       
    
    json_data = json.loads(request.text)
    
    temp_today = json_data['list'][0]['main']['temp']
    
    weather_data[city] = temp_today  #I am updating the weather_data dictionary with the weather info

print(weather_data)

In [None]:
#Now I can do fun things with this data in the dictionary

coldest_weather = min(weather_data, key=weather_data.get)
hottest_weather = max(weather_data, key=weather_data.get)

print(f"The coldest weather today is in {coldest_weather}")
print(f"The hottest weather today is in {hottest_weather}")
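
The key= part of min() and max() can look a bit magical, so here is a sketch with made-up temperatures (these are not real readings). Without key=, min() compares the city names alphabetically. With key=sample_temps.get, it compares the temperature stored for each city instead.

```python
# Made-up temperatures for illustration only
sample_temps = {"Oslo": 41.5, "Dubai": 98.2, "Chicago": 55.0}

alphabetical = min(sample_temps)                       # compares the keys themselves
coldest = min(sample_temps, key=sample_temps.get)      # compares each key's value
hottest = max(sample_temps, key=sample_temps.get)

print(alphabetical)              # Chicago
print(coldest)                   # Oslo
print(hottest)                   # Dubai
print(sample_temps[coldest])     # 41.5
```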

# Pandas
[Pandas Documentation](https://pandas.pydata.org/)

Pandas is a heavily used library that gives you a "spreadsheet-like" view of your data and allows you to easily manipulate it.

Pandas uses the 'dataframe' object which organizes data into rows and columns. You can access the data in these rows and columns using either column headers or row indexing. A pandas dataframe is a 2-Dimensional object, and each individual row or column in the dataframe is a pandas series, which is a 1-Dimensional object.
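
As a small sketch of the dataframe/series relationship, here is a two-row dataframe built from made-up numbers. It assumes pandas is installed. The name sample_df is just for this example.

```python
import pandas as pd

# Made-up data for illustration only
sample_df = pd.DataFrame({"City": ["Oslo", "Dubai"],
                          "Temperature": [41.5, 98.2]})

print(sample_df.shape)             # (2, 2): two rows, two columns

temps = sample_df["Temperature"]   # one column is a Series
first_row = sample_df.loc[0]       # one row is also a Series

print(sample_df.ndim)              # 2: a dataframe is 2-Dimensional
print(temps.ndim)                  # 1: a series is 1-Dimensional
print(first_row["City"])           # Oslo
```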

In [None]:
#first we have to import pandas

#I have renamed pandas as 'pd'. This is not required but is convention
#basically I have given a variable name to an entire library!
import pandas as pd    

In [None]:
#Now let's take our weather_data and make it a pandas dataframe
#notice, I reuse the name weather_data, which overwrites the dictionary with the dataframe
weather_data = pd.DataFrame(weather_data.items(), columns=['City', 'Temperature'])

print(weather_data)



In [None]:
#print dataframe columns using dictionary syntax
print(weather_data['Temperature'])

In [None]:
#print dataframe rows using row indexing with the .loc indexer (note the square brackets)
#look to the left of the results above. Those are row indexes

#just like a list
print(weather_data.loc[2])

print()

#or access it by value, just like a dictionary
#this is tricky because first you have to locate the value in
#the 'City' column
print(weather_data.loc[weather_data['City'] == 'Chicago'])

In [None]:
#let's make this a little more interesting. 
#I have included a file for all of you about Mike Trout, baseball player
#with his career statistics

df = pd.read_csv('MikeTroutData.csv')

print(df)

## Math with pandas dataframes
You can manipulate the data frames just like you would any other object.
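
Here is a sketch of what column math looks like, using made-up numbers rather than the real file. It assumes pandas is installed. The column names Year, HR, and Salary match the ones used below, but the values and the name stats are invented for illustration. When you divide one column by another, pandas does the division row by row.

```python
import pandas as pd

# Made-up numbers for illustration only
stats = pd.DataFrame({"Year": [2019, 2020],
                      "HR": [40, 20],
                      "Salary": [1000000, 800000]})

# Column arithmetic happens row by row, with no loop needed
stats["pay_per_home_run"] = stats["Salary"] / stats["HR"]

print(stats["pay_per_home_run"].tolist())    # [25000.0, 40000.0]
print(stats["HR"].sum())                     # 60
```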

In [None]:
#I am removing a few columns to make room for a new one
del df['Age']
del df['G']
del df['AB']

#I am assigning some of the columns to shorter variable names just to make things look nice
home_runs = df['HR']
salary = df['Salary']
year = df['Year']

#Now create a new column based on values of other columns
df['pay_per_home_run'] = salary/home_runs
pay_per_home_run = df['pay_per_home_run']

print(df)


## [Matplotlib](https://matplotlib.org/)

Matplotlib is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in python scripts, the python and IPython shells, Jupyter Notebook, etc. Matplotlib tries to make easy things easy and hard things possible.

From [Matplotlib's Wikipedia page](https://en.wikipedia.org/wiki/Matplotlib): Matplotlib is a plotting library for the python programming language and its numerical mathematics extension, NumPy. It provides an object-oriented API for embedding plots into applications.

In [None]:
%matplotlib notebook
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

plt.bar(year, pay_per_home_run)

In [None]:
#Now let's make the graph look nice. You can make it look any way you want!

fig, ax = plt.subplots()

plt.xlabel('Year')
plt.xticks(year, rotation=45)     #put a tick at every year and rotate the labels

formatter = ticker.FormatStrFormatter('$%.0f')     #formatting y axis as dollar amounts
ax.yaxis.set_major_formatter(formatter)

plt.ylabel('Price')           
plt.suptitle('Mike Trout Yearly Pay Per Home Run')
plt.bar(year, pay_per_home_run)
plt.show()