## Motivation
Before we begin downloading and exploring datasets, we want to share the motivation behind this project. Throughout our master's program, we worked with clean, well-formatted datasets that came with codebooks and proper documentation. We would create a few variables and jump right into the regressions. When we embarked on our thesis projects, however, we had to combine multiple publicly available datasets, depending on the research question, and build the analytical datasets ourselves. This project demonstrates one way to find, clean, and set up an analytical dataset using Stata.

### Public Use Files (PUFs)

Here is a short list of sources of publicly available data:

- https://data.detroitmi.gov/
- https://opendata.cityofnewyork.us/
- https://www.data.gov/
- https://www.google.com/publicdata/directory
- https://data.gov.sg/
- https://www.ebrd.com/cs/Satellite?c=Content&cid=1395236498263&d=Mobile&pagename=EBRD%2FContent%2FContentLayout
- http://erf.org.eg/
- https://www.cdc.gov/nchs/index.htm
- https://capstat.nyc/
- https://atlasdata.dartmouth.edu/

For this exercise, we will use the NYPD's Motor Vehicle Collisions data, which can be found on https://opendata.cityofnewyork.us/.

Stata's `import` command reads several file formats, such as delimited text (`import delimited`) and Excel workbooks (`import excel`). Other commands, some of them user-written, read additional types of data. For example:
- `insheetjson`
- `libjson`
- `spshape2dta`
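
User-written commands have to be installed before their first use. The following is a minimal sketch, assuming that `insheetjson` and `libjson` are still distributed through the SSC archive and that an internet connection is available:

```stata
* Install the user-written JSON tools from SSC (assumed to be hosted there).
* insheetjson relies on the libjson Mata library, so we install both.
ssc install libjson, replace
ssc install insheetjson, replace

* Confirm that Stata can now find the command.
which insheetjson
```

To our knowledge, `spshape2dta` ships with Stata 15 and later, so it should not need a separate installation.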

In [None]:
import delimited "https://data.cityofnewyork.us/api/views/h9gi-nx95/rows.csv?accessType=DOWNLOAD", clear

Stata stores the current date and time in the system global macros `$S_DATE` and `$S_TIME`, which we can display at any point.

In [None]:
display "$S_DATE $S_TIME"
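
The same information is also available through Stata's c-class values `c(current_date)` and `c(current_time)`. As a small sketch, we can store the time stamp in a local macro for later reuse (the macro name `stamp` is our own choice for illustration):

```stata
* c(current_date) and c(current_time) hold the same values as $S_DATE and $S_TIME.
display "`c(current_date)' `c(current_time)'"

* Store the time stamp in a local macro (the name "stamp" is arbitrary).
local stamp "`c(current_date)' `c(current_time)'"
display "Time stamp: `stamp'"
```

Keep in mind that a local macro only exists while the do-file or cell that defined it is running.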

We can also attach a note to the dataset that we have in memory. Adding notes is good practice because notes are saved with the dataset and can hold any kind of information about it. In this example, we add a time stamp note.

In [None]:
notes: "Downloaded $S_DATE $S_TIME"
notes list
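
As an alternative sketch, `notes` can insert the time stamp itself: the token `TS`, written in capitals with a blank on either side, is replaced by the current date and time. The note text below is only an illustration.

```stata
* TS (capitals, blanks on both sides) is replaced with the current date and time.
notes: TS Downloaded from NYC Open Data

* List all notes attached to the dataset.
notes list
```

The note is added to whatever dataset is currently in memory, so the existing notes are kept and the new one is appended.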

Relative file paths let us move easily from the current directory to another folder. Right now, the current directory is the dofiles folder. We use `..\` to move up one level to the "Stata Class" folder and then go down one level into "input_data" to save the dataset.

In [1]:
cd

C:\Users\jerem\Documents\Stata Class\dofiles


In [None]:
save "..\input_data\NYPD_Motor_Vehicle_Collisions.dta"
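
Two details are worth keeping in mind when saving. First, `save` stops with an error if the file already exists unless we add the `replace` option. Second, Stata accepts forward slashes in paths on Windows, macOS, and Linux, and a backslash placed directly before a macro (for example `\$S_DATE`) prevents the macro from being expanded. The following is a minimal sketch, assuming the collisions data are still in memory, the current directory is dofiles, and the "input_data" folder sits one level up:

```stata
* Create the folder if it is missing; capture ignores the error if it already exists.
capture mkdir "../input_data"

* Forward slashes work on all platforms; replace overwrites an earlier copy of the file.
save "../input_data/NYPD_Motor_Vehicle_Collisions.dta", replace

* Check that the file was written.
confirm file "../input_data/NYPD_Motor_Vehicle_Collisions.dta"
```

Use `replace` with care, because it overwrites the existing file without asking.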