#### Metered Parking in Boston
We are going to do some analysis on what metered parking is available in the Boston area, using data taken from Boston's [Open Data portal](https://data.boston.gov/dataset/parking-meters). 

This file is available in the repository as a [csv](https://www.computerhope.com/issues/ch001356.htm) (a comma-separated values file, similar to the tabular data you would work with in Excel).

### Exercise Notes:

   ##### For each technique, we will:
   - present an explanation which will include an example of the syntax.  `Syntax will be contained in code blocks like this.`
    Italicized portions of the example syntax should be replaced with your own variables.  Normal text (not italicized) should be copied precisely.
   - work through an example
   - allow you to practice a similar example on your own

### Step 1: Import the libraries you plan to use.

(This is done in the first lines of your script.  Always keep in mind that the script will run in order and won't have access to variables and functions set later in the file, just as you wouldn't be able to give someone the weather report if you hadn't looked it up yet.)
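
As a small illustration of this ordering rule, the sketch below tries to use a variable before it has been set. The name `weather_report` is a made-up example, not part of the parking data.

```python
# Trying to use a variable before it exists raises a NameError.
try:
    print(weather_report)  # not defined yet at this point in the script
except NameError as error:
    message = str(error)
    print("Python says:", message)

# Once the variable has been set, later lines can use it.
weather_report = "sunny"
print(weather_report)
```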

We will primarily use [pandas](https://pandas.pydata.org/pandas-docs/stable/reference/index.html).  This library allows us to easily manipulate and analyze data structures.

Importing pandas "as pd" gives it a nickname, so that instead of typing the full name later, we can substitute "pd".  
Example: `pandas.DataFrame.columns` can instead be typed `pd.DataFrame.columns`.

In [1]:
import pandas as pd


![csv example](./data/meters_csv.png)

### Step 2: Loading a csv 
Pandas comes with built-in functionality to read in a csv.  
The syntax is:  
`pd.read_csv('`*`file_path`*`')`

To make this file easier to refer back to later, we are going to save it to a variable name of our choice. I'm going to call it boston_meters.

In [None]:
# Remember, variable names cannot contain spaces or dashes (Python reads a dash as a minus sign).
# To make the name more readable, you can separate words with_underscores.

boston_meters = pd.read_csv('./data/parking_meters_boston.csv')
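
If you want to try `pd.read_csv` without the file on disk, here is a minimal sketch that reads a tiny in-memory csv instead. The three rows, their values, and the `METER_ID` column are invented for illustration. Only `VENDOR` and `PAY_POLICY` are column names used later in this exercise.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: made-up rows and values,
# used only so the example runs without the csv on disk.
sample_csv = io.StringIO(
    "METER_ID,VENDOR,PAY_POLICY\n"
    "1,Vendor A,MON-SAT 8AM-8PM\n"
    "2,Vendor B,MON-FRI 8AM-6PM\n"
    "3,Vendor A,MON-SAT 8AM-8PM\n"
)

# read_csv accepts a file path or a file-like object such as StringIO
sample_meters = pd.read_csv(sample_csv)

print(sample_meters.shape)   # (number of rows, number of columns)
print(sample_meters.head())  # the first few rows of the table
```

`shape` and `head()` are quick ways to confirm that the file loaded the way you expected.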



### Step 3: Exploring the data
There are several techniques we can use to get a sense of what sort of data is available. 

Keep in mind that the code that is run will not automatically display results.  If you want the program to report back to you, you will need to wrap the command (or the variable it is saved to) in a print function.


#### What columns does this csv have?
Let's take a look at the data available in the csv by printing the column headings.  The data structure is identical for the Charlestown and Boston dataframes.

The syntax is:
*`dataframe`*`.columns`

In [None]:
# Remember we named our dataframe "boston_meters" in step 2
# Keep in mind that the code that is run will not automatically display results.  
# If you want the program to report back to you, you will need to wrap the command (or the variable) in a print function

print(boston_meters.columns)
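
The sketch below shows a few other things you can do with `.columns`, using a small made-up dataframe. The `METER_ID` column and all of the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical miniature dataframe with invented values
sample_meters = pd.DataFrame({
    "METER_ID": [1, 2],
    "VENDOR": ["Vendor A", "Vendor B"],
    "PAY_POLICY": ["MON-SAT 8AM-8PM", "MON-FRI 8AM-6PM"],
})

# .columns returns an Index object holding the column headings
print(sample_meters.columns)

# Converting it to a plain list can be easier to read
column_names = list(sample_meters.columns)
print(column_names)

# You can also check whether a particular column exists
has_vendor = "VENDOR" in sample_meters.columns
print(has_vendor)
```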

#### Finding all unique values
The "VENDOR" column of our dataframe tells us which vendors service the meters. How many vendors service the Boston-area meters?  
Syntax: *`dataframe.column`*`.unique()`

In [None]:
print(boston_meters.VENDOR.unique())

In [None]:
# What are the distinct types of pay policies for meters? 

print(boston_meters.PAY_POLICY.unique())
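
`unique()` lists the distinct values, but you still have to count them yourself. As a sketch on made-up data (the vendor names are invented), `nunique()` and `value_counts()` can do the counting for you:

```python
import pandas as pd

# Hypothetical data: two invented vendors and one missing value
sample_meters = pd.DataFrame({
    "VENDOR": ["Vendor A", "Vendor B", "Vendor A", None],
})

# unique() returns every distinct value, including a missing one if present
vendors = sample_meters["VENDOR"].unique()
print(vendors)

# nunique() counts the distinct values; missing values are not counted by default
vendor_count = sample_meters["VENDOR"].nunique()
print(vendor_count)

# value_counts() reports how many rows each value appears in
rows_per_vendor = sample_meters["VENDOR"].value_counts()
print(rows_per_vendor)
```

Here the column is selected with `dataframe['column']` rather than `dataframe.column`. Both forms work when the column name has no spaces.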


### Reorganizing the Data

#### Filtering
Sometimes you may only want data with certain attributes. You can filter the data and save the result to a new dataframe, or delete data from the table.  Filtering is also useful when you want a count of the rows that match your query.


![string.contains documentation](./data/str-contains-method.png)  
One way to do this is to check whether the cell contains a certain string (remember, a string is a sequence of characters).  
Syntax: *`dataframe[dataframe['column']`*`.str.contains(`*`'string we are looking for'`*`)]`

   This will return all rows for which the check evaluates to True.  In the next example we want all the rows that *do not contain* a certain string.  We are in luck! We can easily invert our results by including `~` in front of the condition like this: *`dataframe[~ dataframe['column']`*`.str.contains(`*`'string we are looking for'`*`)]`
   
   
   
Some additional methods include `str.startswith("")` and `str.endswith("")`
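
Here is a minimal sketch of these string filters side by side. The pay policy values are invented for illustration and may not match the format used in the real dataset.

```python
import pandas as pd

# Hypothetical pay policies, including one missing value
sample_meters = pd.DataFrame({
    "PAY_POLICY": ["MON-SAT 8AM-8PM", "MON-FRI 8AM-6PM", None, "SAT 10AM-4PM"],
})
policy = sample_meters["PAY_POLICY"]

# Rows whose policy contains 'SAT'; na=False treats missing cells as "no match"
has_sat = sample_meters[policy.str.contains("SAT", na=False)]

# ~ inverts the condition. Missing cells counted as "no match",
# so after inverting they are kept in this result.
no_sat = sample_meters[~policy.str.contains("SAT", na=False)]

# startswith and endswith only look at the beginning or end of the string
starts_mon = sample_meters[policy.str.startswith("MON", na=False)]
ends_6pm = sample_meters[policy.str.endswith("6PM", na=False)]

# len() gives a count of the rows that matched each query
print(len(has_sat), len(no_sat), len(starts_mon), len(ends_6pm))
```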


In [None]:
# Let's find out which meters don't require payment on Saturdays.
# The optional parameter "na=False" treats cells with no data as not containing 'SAT';
# without it they would be neither True nor False and could not be used to filter.
# Because we invert with ~, rows with no data are kept in the result.

free_saturdays = boston_meters[~ boston_meters['PAY_POLICY'].str.contains('SAT', na=False)]
print(free_saturdays)

In [None]:
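# Which meters don't enforce towing according to the dataset?
# na=False keeps the filter from failing if some cells have no data
no_tow_zones = boston_meters[boston_meters['TOW_AWAY'].str.contains('none', na=False)]
print(no_tow_zones)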


#### Merging Dataframes

![Merge Types](./data/merges.png)

We have a dataframe listing parking meters for Charlestown and another dataframe listing parking meters for Boston.
Try combining these two into one dataframe.


Syntax: *`dataframe`*`.merge(`*`dataframe_2`*`, how='`*`merge type`*`')`   (The merge type can be, for example, 'inner', 'outer', 'left', or 'right'. The default is an inner merge.)
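
To see how the merge type changes the result, here is a sketch with two tiny made-up dataframes that share the same columns. The names `boston_sample` and `charlestown_sample`, the `METER_ID` column, and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical samples; meter 3 deliberately appears in both
boston_sample = pd.DataFrame({
    "METER_ID": [1, 2, 3],
    "VENDOR": ["Vendor A", "Vendor B", "Vendor A"],
})
charlestown_sample = pd.DataFrame({
    "METER_ID": [3, 4],
    "VENDOR": ["Vendor A", "Vendor B"],
})

# With no 'on' argument, merge matches rows on all the columns the two share.
# An inner merge keeps only the rows found in both dataframes.
inner = boston_sample.merge(charlestown_sample, how="inner")

# An outer merge keeps every row found in either dataframe.
outer = boston_sample.merge(charlestown_sample, how="outer")

print(inner)
print(outer)
```

When both dataframes have identical columns, as the Boston and Charlestown ones do, an outer merge is likely the type you want for combining them, since the default inner merge would drop any meter that appears in only one of the two.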
