- Decide what dataset you'll use for your final project.
- Import the data into a Jupyter notebook using pandas.
- Document where you got the data. Tell me about the available data, with links to any available documentation. Tell me about the scope of the data: source, time period, what the data can and can't tell you.
- Then upload your notebook here. If you have a very large dataset, do not upload that dataset to bCourses! Just upload the notebook and share a link to the dataset with me in the Markdown.

# Import data

The data I'm using for this project is the [Street Tree List](https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq) from the San Francisco Department of Public Works. I downloaded the data on Nov. 6, 2022. The data set had last been updated that same day.

`Source`: DataSF, data provided by San Francisco Department of Public Works  

`Data Dictionary`: See [here](https://data.sfgov.org/api/views/tkzw-k3nq/files/biK1RHNRcrlnB42VCsuvdib3tybKjazIH4kuDcrOczw?download=true&filename=DPW_DataDictionary_Street-Tree-List.pdf)  

`Time period`: Created on Sep. 24, 2012. Updated daily. The earliest `PlantDate` is Sep. 19, 1955.  

`What it can tell me`:  
- Where are the street trees located in San Francisco?
- Which part of the city is covered by more trees? 
- What species of trees are planted in San Francisco?
- How large are the trees? (`DBH` records trunk diameter at breast height, not height.)
- Who are the caretakers of the trees?
- When was each tree planted? In which years were more trees planted?  
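As a rough illustration of the planting-year question, here is a minimal sketch that counts trees per planting year. The tiny `sample` frame is made up for the example; in the notebook the same steps would run on `sf_trees`, assuming `PlantDate` parses cleanly.

```python
import pandas as pd

# Made-up sample rows; the real notebook would use the full `sf_trees` DataFrame.
sample = pd.DataFrame({
    "TreeID": [1, 2, 3, 4],
    "PlantDate": ["2021-10-14", "2021-03-02", None, "1987-12-05"],
})
sample["PlantDate"] = pd.to_datetime(sample["PlantDate"])

# Drop trees with no recorded planting date, then count trees per year.
planted = sample["PlantDate"].dropna()
trees_per_year = planted.dt.year.value_counts().sort_index()
print(trees_per_year)
```

Because most rows in this data set have no `PlantDate`, counts like these describe only the trees with a recorded date.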

`What it cannot tell me`:
- Demographic or land-area data about San Francisco. If I want to compare districts or neighborhoods fairly, for example by trees per square mile or per resident, I still need to use census data or other data sets to complement this one.
- The identity of individual caretakers. The caretaker field only tells me whether it's an agency or a private person, not the person's name.
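To make the trees-per-square-mile idea concrete, here is a minimal sketch of how tree counts could be joined to an outside area table. Everything in it is hypothetical: the `neighborhood` labels, the `areas` table, and its `sq_miles` values are placeholders, not real San Francisco figures.

```python
import pandas as pd

# Hypothetical tree rows, each tagged with a neighborhood label.
trees = pd.DataFrame({
    "TreeID": [1, 2, 3, 4, 5],
    "neighborhood": ["A", "A", "B", "B", "B"],
})

# Hypothetical land areas; these would have to come from census or other data.
areas = pd.DataFrame({
    "neighborhood": ["A", "B"],
    "sq_miles": [0.5, 1.5],
})

# Count trees per neighborhood, then attach each neighborhood's area.
counts = trees.groupby("neighborhood").size().rename("tree_count").reset_index()
density = counts.merge(areas, on="neighborhood", how="left", validate="one_to_one")
density["trees_per_sq_mile"] = density["tree_count"] / density["sq_miles"]
print(density)
```

The `validate="one_to_one"` argument makes the merge fail loudly if a neighborhood appears twice in either table, which would otherwise inflate the counts without warning.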


In [2]:
import pandas as pd
import altair as alt

In [4]:
sf_trees = pd.read_csv('Street_Tree_List.csv')

In [7]:
sf_trees.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 196590 entries, 0 to 196589
Data columns (total 23 columns):
 #   Column                     Non-Null Count   Dtype  
---  ------                     --------------   -----  
 0   TreeID                     196590 non-null  int64  
 1   qLegalStatus               196533 non-null  object 
 2   qSpecies                   196590 non-null  object 
 3   qAddress                   195097 non-null  object 
 4   SiteOrder                  194796 non-null  float64
 5   qSiteInfo                  196590 non-null  object 
 6   PlantType                  196590 non-null  object 
 7   qCaretaker                 196590 non-null  object 
 8   qCareAssistant             24707 non-null   object 
 9   PlantDate                  70878 non-null   object 
 10  DBH                        153021 non-null  float64
 11  PlotSize                   146229 non-null  object 
 12  PermitNotes                53367 non-null   object 
 13  XCoord                     19
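The `info()` output above shows that some columns are mostly empty; `PlantDate`, for example, has 70878 non-null values out of 196590 rows. A minimal sketch of how the share of missing values per column could be summarized, using a made-up `sample` frame in place of `sf_trees`:

```python
import pandas as pd

# Made-up sample rows; the notebook would call this on `sf_trees` instead.
sample = pd.DataFrame({
    "TreeID": [1, 2, 3, 4],
    "PlantDate": ["2021-10-14", None, None, None],
    "DBH": [3.0, 12.0, None, 8.0],
})

# Fraction of missing values per column, with the emptiest columns first.
missing_share = sample.isna().mean().sort_values(ascending=False)
print(missing_share)
```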

In [8]:
# convert the `PlantDate` Dtype to datetime

sf_trees['PlantDate'] = pd.to_datetime(sf_trees['PlantDate'])
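By default `pd.to_datetime` raises an error if it meets a string it cannot parse. A minimal sketch of a more defensive conversion follows. The sample strings and the `format` pattern are assumptions for illustration; the actual date format in `Street_Tree_List.csv` should be checked before reusing it.

```python
import pandas as pd

# Made-up raw values: one valid date, one missing value, one malformed string.
raw = pd.Series(["10/14/2021 12:00:00 AM", None, "not a date"])

# errors="coerce" turns unparseable values into NaT instead of raising.
# The format string is an assumption about how the dates are written.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p", errors="coerce")

# Report how many values could not be parsed or were missing to begin with.
print(parsed.isna().sum(), "of", len(parsed), "values are NaT")
```

Comparing the number of `NaT` values after conversion with the number of nulls before it shows whether any dates were silently dropped.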

In [9]:
sf_trees.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 196590 entries, 0 to 196589
Data columns (total 23 columns):
 #   Column                     Non-Null Count   Dtype         
---  ------                     --------------   -----         
 0   TreeID                     196590 non-null  int64         
 1   qLegalStatus               196533 non-null  object        
 2   qSpecies                   196590 non-null  object        
 3   qAddress                   195097 non-null  object        
 4   SiteOrder                  194796 non-null  float64       
 5   qSiteInfo                  196590 non-null  object        
 6   PlantType                  196590 non-null  object        
 7   qCaretaker                 196590 non-null  object        
 8   qCareAssistant             24707 non-null   object        
 9   PlantDate                  70878 non-null   datetime64[ns]
 10  DBH                        153021 non-null  float64       
 11  PlotSize                   146229 non-null  object  

In [10]:
sf_trees

Unnamed: 0,TreeID,qLegalStatus,qSpecies,qAddress,SiteOrder,qSiteInfo,PlantType,qCaretaker,qCareAssistant,PlantDate,...,XCoord,YCoord,Latitude,Longitude,Location,Fire Prevention Districts,Police Districts,Supervisor Districts,Zip Codes,Neighborhoods (old)
0,217365,Section 806 (d),Ceanothus 'Ray Hartman' :: California Lilac 'R...,707 Rockdale Dr,1.0,Sidewalk: Property side : Yard,Tree,Private,,2021-10-14,...,5.997488e+06,2.098235e+06,37.741209,-122.451285,"(37.74120925101712, -122.45128526411095)",9.0,7.0,4.0,59.0,40.0
1,92771,DPW Maintained,Tristaniopsis laurina :: Swamp Myrtle,11X Blanken Ave,4.0,Sidewalk: Curb side : Cutout,Tree,Private,,2021-10-14,...,6.011718e+06,2.087394e+06,37.712247,-122.401320,"(37.712246915438215, -122.40132023435935)",10.0,3.0,8.0,309.0,1.0
2,23904,DPW Maintained,Prunus subhirtella 'Pendula' :: Weeping Cherry,1600X Webster St,6.0,Median : Cutout,Tree,DPW,,NaT,...,6.003596e+06,2.114195e+06,37.785380,-122.431304,"(37.78537959802679, -122.43130418097743)",13.0,9.0,11.0,29490.0,13.0
3,28646,DPW Maintained,Prunus subhirtella 'Pendula' :: Weeping Cherry,1600X Webster St,7.0,Median : Cutout,Tree,DPW,,NaT,...,6.003558e+06,2.114375e+06,37.785872,-122.431449,"(37.78587163716589, -122.43144931782685)",13.0,9.0,11.0,29490.0,13.0
4,229807,DPW Maintained,Jacaranda mimosifolia :: Jacaranda,2560 Bryant St,1.0,Sidewalk: Curb side : Cutout,Tree,Private,,NaT,...,6.009700e+06,2.102427e+06,37.753411,-122.409355,"(37.75341142310638, -122.40935530851043)",2.0,4.0,7.0,28859.0,19.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
196585,231304,Permitted Site,Metrosideros excelsa :: New Zealand Xmas Tree,365 Valley St,1.0,Sidewalk: Curb side : Cutout,Tree,Private,,1987-12-05,...,6.003550e+06,2.099202e+06,37.744210,-122.430394,"(37.74420986420245, -122.43039418027085)",2.0,7.0,5.0,63.0,22.0
196586,269391,Permitted Site,Ceanothus 'Ray Hartman' :: California Lilac 'R...,134 Aptos Ave,2.0,Sidewalk: Property side : Yard,Tree,Private,,2022-11-02,...,5.992950e+06,2.094085e+06,37.729555,-122.466675,"(37.7295545299999, -122.4666745393668)",9.0,8.0,4.0,59.0,40.0
196587,267351,Significant Tree,Magnolia grandiflora :: Southern Magnolia,295 Yerba Buena Ave,1.0,Sidewalk: Property side : Yard,Tree,Private,,2022-11-03,...,5.995171e+06,2.095630e+06,37.733924,-122.459107,"(37.73392381107357, -122.45910719357623)",9.0,7.0,4.0,59.0,40.0
196588,160078,DPW Maintained,Jacaranda mimosifolia :: Jacaranda,420 Otsego Ave,1.0,Sidewalk: Curb side : Cutout,Tree,Private,,2022-11-12,...,6.000125e+06,2.091352e+06,37.722460,-122.441674,"(37.72245966606217, -122.44167446391262)",9.0,7.0,6.0,28861.0,25.0
