# UCLA ITS Data Camp, Pre-Course Exercise
## Getting Data
By now you should have a good understanding of the basics of the Python programming language and be familiar with the following concepts:
- Core Object Types
- Variables
- Loops
- Conditionals
- Basic Functions
- Data Structures (lists, dictionaries, etc.)

For the first activity, we will use this knowledge to practice the first step in any data project: acquiring data.

### Course Exercise Format
Course **lectures** will consist of complete notebooks that we step through together as a class to understand how different inputs produce different outputs. Course **exercises** will consist of incomplete notebooks that you will complete on your own (you may ask me or others for hints if you get stuck).

The part of each exercise that needs to be completed will be marked with a `TODO` comment. If there is no `TODO`, then you should be able to run the cell without making any changes. For example, in the first code cell below, you will see that the csv library has already been imported for you; there is no `TODO` above the `import` line. However, the `TODO` two lines below specifies that you will need to complete the code to import each collision table using the [csv](https://docs.python.org/3/library/csv.html) library.

### Loading Data from Text Files
One of the most common ways to bring data into your project is by reading in plain-text files. Let's practice downloading data and opening it using Python's built-in `csv` module.
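
As a minimal sketch of what reading a plain-text file with `csv` can look like, the snippet below writes a tiny sample to a temporary file and reads it back with `csv.reader`. The helper name `read_rows` and the sample columns (`case_id`, `severity`) are made up for illustration; they are not the GeoHub schema.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    """Return every row of a CSV file as a list of lists (header row included)."""
    # newline="" lets the csv module handle line endings itself
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


# Hypothetical sample data, written to a temporary file for the demo
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.csv"
    sample_path.write_text("case_id,severity\n1001,2\n1002,4\n", encoding="utf-8")
    rows = read_rows(sample_path)

header, first_record = rows[0], rows[1]
print(header)        # column names
print(first_record)  # values are read as strings
```

Note that `csv.reader` returns every value as a string, so numeric columns need to be converted before doing arithmetic on them.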

##### Step 1: Create a Project Directory
Create a basic project directory structure, similar to what was described in "Considerations for Data Projects", and place this notebook at the top level, like this:

```
pre-course-prj/                     
├── data/                        
├── output/                      
└── Pre-Course Activity.ipynb    
```
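
If you would rather create the folders from Python than from your file browser, one possible sketch uses `pathlib`. The helper name `make_project_dirs` is hypothetical, and the demo runs in a throwaway temporary directory so that it does not touch your real files.

```python
import tempfile
from pathlib import Path


def make_project_dirs(root):
    """Create the data/ and output/ subfolders under root, if missing."""
    root = Path(root)
    for name in ("data", "output"):
        # exist_ok=True makes the call safe to repeat
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


# Demo in a throwaway directory; in practice, pass your own project path
with tempfile.TemporaryDirectory() as tmp:
    project = make_project_dirs(Path(tmp) / "pre-course-prj")
    created = sorted(p.name for p in project.iterdir() if p.is_dir())

print(created)
```
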
##### Step 2: Download Collisions from LA City's GeoHub
The City of Los Angeles has an Open Data Portal focused specifically on spatial datasets, the GeoHub. As part of [Vision Zero](http://visionzero.lacity.org/), the City's initiative to end traffic deaths, it recently posted updated data on traffic collisions within City boundaries.

1. Go to each of the following links for collision data between 2009 and 2013: [Collisions](http://visionzero.geohub.lacity.org/datasets/ladot::collisions-2009-2013-switrs), [Parties](http://visionzero.geohub.lacity.org/datasets/ladot::party-tables-collisions-2009-2013-switrs), and [Victims](http://visionzero.geohub.lacity.org/datasets/ladot::victim-tables-collisions-2009-2013-switrs).

2. For each data source, click on "Download" and then "Spreadsheet."

3. Put all three downloaded collision tables into the "data" folder.
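
To double-check that the downloads landed in the right place, a small sketch like the one below lists the CSV files in a folder. The helper name `list_csv_files` and the file names in the demo are placeholders; your downloaded files will be named differently.

```python
import tempfile
from pathlib import Path


def list_csv_files(data_dir):
    """Return the names of the .csv files in data_dir, sorted alphabetically."""
    return sorted(p.name for p in Path(data_dir).glob("*.csv"))


# Demo with placeholder file names; your downloads will be named differently
with tempfile.TemporaryDirectory() as tmp:
    for name in ("collisions.csv", "parties.csv", "victims.csv", "notes.txt"):
        (Path(tmp) / name).touch()
    found = list_csv_files(tmp)

print(found)
```

In the notebook itself you would call `list_csv_files("data")` and expect to see three files.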

##### Step 3: Read in the Collision Data

In [None]:
# Import packages
import csv

# TODO: Read in collision data (all 3 CSV tables)
collisions = 
parties = 
victims = 

In [None]:
# TODO: Complete the function to return a record count for each table. Don't count the header row!
def record_count(my_list):
    
    
# Run record_count() on the three tables.
collision_record_ct = record_count(collisions)
party_record_ct = record_count(parties)
victim_record_ct = record_count(victims)

# Print out the results. Which has the most records?
print(f"The collision table has {collision_record_ct} records.")
print(f"The party table has {party_record_ct} records.")
print(f"The victim table has {victim_record_ct} records.")

### Loading Data from Python Packages
Another method for obtaining data is through Python packages themselves. 

##### Step 1: Install package (if needed)
Let's get [OpenStreetMap](https://www.openstreetmap.org) network data using Geoff Boeing's wonderful [osmnx](https://github.com/gboeing/osmnx) package. Before we start, we need to install `osmnx`, since it is included in neither the Python standard library nor the Anaconda distribution of Python. Following the instructions in "Software Installation" and on the package's GitHub install page, go ahead and install the package.

##### Step 2: Import package within project
Although we have installed the Python package, we cannot use it until we import it into our project's environment. You can confirm that the installation was successful by running the cells below to import the package. _A note about importing packages: you only need to import a package once per Python session. Once you close the notebook (and therefore shut down the Python kernel), you will need to import the packages again the next time you start the notebook._
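
If you want to check whether a package is available before importing it, one option is the standard library's `importlib.util.find_spec`. This is only a sketch; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


print(is_installed("csv"))    # standard library module
print(is_installed("osmnx"))  # depends on your environment
```

A result of `False` for `osmnx` suggests that the install step did not run in the same environment as this notebook's kernel.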

##### Step 3: Read Documentation for Specific Functions to download data
Once you've imported the package, check the package-specific documentation for how to download data. Let's start by getting some street network files for an area of your choosing by following the example [here](https://github.com/gboeing/osmnx-examples/blob/master/notebooks/02-example-osm-to-shapefile.ipynb). 
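
As a rough sketch of what a download call can look like, the function below wraps `osmnx.graph_from_place`. The wrapper name `download_drive_network` and the example place are assumptions for illustration. The call needs `osmnx` installed and an internet connection, and the package's API may differ between versions, so check the documentation for the version you installed.

```python
def download_drive_network(place_name):
    """Download the drivable street network for a named place from OpenStreetMap."""
    # Imported here so that defining the function does not require osmnx
    import osmnx as ox

    # graph_from_place geocodes the name and returns a networkx MultiDiGraph
    return ox.graph_from_place(place_name, network_type="drive")


# Example call (needs osmnx installed and an internet connection):
# G = download_drive_network("Piedmont, California, USA")
# print(G.number_of_nodes(), G.number_of_edges())
```

A small city or neighborhood keeps the download quick; larger areas can take noticeably longer.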

In [None]:
# TODO: Import the newly-installed osmnx package


# TODO: Get some amount of data


##### Step 4: Export as a shapefile to the 'output' project directory
The Shapefile format is one of the most common data formats for storing GIS data. Developed by ESRI, it is actually a collection of different files (usually about six) that contain the vector data, attribute information, projection, and other data. Take a look at [this notebook](https://github.com/gboeing/osmnx-examples/blob/master/notebooks/05-example-save-load-networks-shapes.ipynb) for an example of how to export data from osmnx to the Shapefile format for storage. Export the data you downloaded into the 'output' folder.

In [None]:
# TODO: Export to disk as a shapefile into the 'output' folder


##### Step 5: Confirm that you exported the shapefile correctly
There is an excellent website for quickly checking spatial data called [mapshaper](https://mapshaper.org/). If your file is not too big, try viewing it there. Either drop your `.shp` file into the browser window or use the site's file navigator to point to your 'output' folder.
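
Before uploading, you can also check from Python that the companion files were written next to the `.shp` file. The sketch below is one way to do that; the helper name `shapefile_components` and the layer name `edges` are hypothetical, and the exact set of files you get depends on how you exported.

```python
import tempfile
from pathlib import Path

# Common shapefile sidecar extensions; the first three are required by the format
EXTENSIONS = (".shp", ".shx", ".dbf", ".prj", ".cpg")


def shapefile_components(shp_path):
    """Map each common shapefile extension to whether that file exists."""
    shp_path = Path(shp_path)
    return {ext: shp_path.with_suffix(ext).exists() for ext in EXTENSIONS}


# Demo with empty placeholder files; "edges" is a hypothetical layer name
with tempfile.TemporaryDirectory() as tmp:
    for ext in (".shp", ".shx", ".dbf", ".prj"):
        (Path(tmp) / f"edges{ext}").touch()
    report = shapefile_components(Path(tmp) / "edges.shp")

print(report)
```

If `.shp`, `.shx`, or `.dbf` is reported as missing, the export is incomplete and mapshaper will likely not be able to display it.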