## Python 101: Installing and Getting Started with Python at Urban 

There are countless ways to install Python, which can make getting started notoriously intimidating. For new users at Urban, we recommend [Anaconda](https://docs.anaconda.com/anaconda/user-guide/getting-started/) as a great way to quickly set up and manage your Python environment. We recommend this approach because the Anaconda Distribution comes pre-installed with many useful packages for data science, and makes it much easier to manage environments and dependencies. 

In this guide, we install Anaconda, launch the Anaconda Navigator, and learn how to use Spyder and Jupyter notebooks to start writing Python code. 

---
<div>
<img src="https://imgs.xkcd.com/comics/python_environment.png" width="400"/>
</div>

---

**1. Install Anaconda**. Step-by-step installation instructions are available in the Anaconda documentation for [Windows](https://docs.anaconda.com/anaconda/install/windows/) and [macOS](https://docs.anaconda.com/anaconda/install/mac-os/). We recommend accepting the suggested defaults (i.e., install for "Just Me", use the default destination folder, and do not add Anaconda to your PATH environment variable). 

Note: With these settings, you do not need administrator privileges to install Anaconda on your Urban computer. 

**2. Open the Anaconda Navigator**. Once you have installed Anaconda, open Anaconda Navigator. Navigator is a desktop graphical user interface that lets you launch applications and manage conda packages and environments without using the command line. On a Windows machine, click Start and select Anaconda Navigator from the menu. On a Mac, use Spotlight to search for Anaconda Navigator. 

For more information about using Anaconda Navigator, see the [User Guide](https://docs.anaconda.com/anaconda/navigator/) and [Cheat Sheet](https://docs.anaconda.com/_downloads/9ee215ff15fde24bf01791d719084950/Anaconda-Starter-Guide.pdf). 

**3. Launch Spyder**. Spyder is an integrated development environment (IDE) included with Anaconda (similar to RStudio). You can [launch Spyder from the Anaconda Navigator](https://docs.anaconda.com/anaconda/user-guide/getting-started/#run-python-in-spyder-ide-integrated-development-environment). After launching Spyder, note the following panes in its interface: 
* Editor pane (left): create, open, and edit files with features like autocompletion and syntax highlighting 
* IPython console (bottom right): run code interactively 
* Help pane (top right): read documentation for Python objects 
* Variable Explorer (top right): browse and interact with Python objects 

For more information about using Spyder, see the [Quickstart Guide](https://docs.spyder-ide.org/current/quickstart.html) and [Intro Videos](https://docs.spyder-ide.org/current/videos/first-steps-with-spyder.html). 

**4. Launch a Jupyter Notebook**. Jupyter Notebook is a web-based application that lets you create, edit, run, and share code in an interactive notebook alongside text and visualizations (similar to R Markdown). Like Spyder, Jupyter comes installed with Anaconda and can be [launched from the Anaconda Navigator](https://docs.anaconda.com/anaconda/user-guide/getting-started/#run-python-in-a-jupyter-notebook). After launching Jupyter, a new browser window will open, connected to a server running locally on your computer (localhost). From there, you can create a new Python 3 (ipykernel) notebook. 

For more information about using Jupyter notebooks, see the [What is Jupyter Notebook?](https://github.com/jupyter/notebook/blob/main/docs/source/examples/Notebook/What%20is%20the%20Jupyter%20Notebook.ipynb) and [Notebook Basics](https://github.com/jupyter/notebook/blob/main/docs/source/examples/Notebook/Notebook%20Basics.ipynb) guides. 

**5. Start Coding in Python**. Now that you have Anaconda installed and can launch Spyder and a Jupyter notebook, you're ready to start writing Python code. Here, we'll write simple `hello_world()` functions and work with a CSV file. Try running the code in both Spyder and in a Jupyter notebook. 

**Note**: The goal of this guide is to get Python installed on your computer using Anaconda, verify that you can launch the Anaconda Navigator, and then practice running Python code in Spyder and a Jupyter notebook. You do NOT need to understand what each line below is doing. The Python Users Group will hold future sessions where we will dive into Python syntax, data types, and libraries for working with data (including `pandas` and `numpy`). 
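
Before running the examples below, it can help to confirm that Spyder or Jupyter is using the Python that Anaconda installed. The following is a minimal sketch of such a check, not an official Anaconda procedure; the helper name `describe_environment` is made up for this illustration.

```python
import sys
import importlib.metadata


def describe_environment(packages=("pandas", "numpy")):
    """Collect the Python version, interpreter path, and a few package versions."""
    info = {
        "python": sys.version.split()[0],  # e.g. a string like "3.x.y"
        "executable": sys.executable,      # path of the running interpreter
    }
    for name in packages:
        try:
            info[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            # The package is not installed in this environment
            info[name] = "not installed"
    return info


# Print one line per item so the result is easy to scan
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If the interpreter path does not point inside your Anaconda folder, the application was probably launched from a different Python installation.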

---

First, define a simple function called `hello_world()`. 

In [1]:
def hello_world(): 
    print("Hello, world!")

Then, call the function and note that it prints the text "Hello, world!". 

In [2]:
hello_world()

Hello, world!


Next, define a similar function called `hello_name()` that says hello to the `name` passed into the function as an argument. 

In [3]:
def hello_name(name):
    print(f"Hello, {name}!")

Now, call the function, passing any name as the `name` argument. 

In [4]:
hello_name(name="Erika")

Hello, Erika!
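
Both functions above print their message directly. A common variation is to return the text instead, so the caller decides what to do with it, and to give the argument a default value. The sketch below illustrates that idea; `make_greeting` is a hypothetical name, not part of the original examples.

```python
def make_greeting(name="world"):
    """Build the greeting text and return it instead of printing it."""
    return f"Hello, {name}!"


# Store the returned text in a variable, then print it
greeting = make_greeting("Erika")
print(greeting)         # prints: Hello, Erika!

# With no argument, the default value "world" is used
print(make_greeting())  # prints: Hello, world!
```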


Now, let's write Python code to read a CSV file. Here, we're reading a CSV file from the NYC Open Data Portal of the [2018 Central Park Squirrel Census](https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw). From the dataset description: 
> [The Squirrel Census](https://www.thesquirrelcensus.com/) is a multimedia science, design, and storytelling project focusing on the Eastern gray (Sciurus carolinensis). They count squirrels and present their findings to the public. This table contains squirrel data for each of the 3,023 sightings, including location coordinates, age, primary and secondary fur color, elevation, activities, communications, and interactions between squirrels and with humans.

First, we need to import the `pandas` library. Note that `pandas` is imported as `pd` by convention using the syntax below. Because we installed Python using Anaconda, we already have `pandas` (and many other useful libraries) installed. 

In [5]:
import pandas as pd 

Next, we can define the URL and use the `pd.read_csv()` function to read the CSV. 

In [6]:
# Define URL for the 2018 Central Park Squirrel Census
squirrels_csv_url = "https://data.cityofnewyork.us/api/views/vfnx-vebw/rows.csv"

# Read CSV using pandas 
squirrels_df = pd.read_csv(squirrels_csv_url)

The next line prints the dimensions of the dataset. It has 3,023 rows and 31 columns. 

In [7]:
# Print size of dataframe 
print(squirrels_df.shape)

(3023, 31)
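
Reading from a URL requires a network connection. To see `pd.read_csv()` and `.shape` at work without one, the sketch below reads a tiny CSV held in memory. The rows and column names are made up for illustration and are not real census data.

```python
import io

import pandas as pd

# Made-up sample rows (NOT real census data), used only to show the mechanics
sample_csv = io.StringIO(
    "squirrel_id,shift,fur_color\n"
    "A-01,AM,Gray\n"
    "A-02,PM,Cinnamon\n"
    "A-03,PM,Gray\n"
)

# read_csv accepts a file-like object as well as a file path or URL
sample_df = pd.read_csv(sample_csv)

# .shape is a (rows, columns) tuple, so it can be unpacked into two variables
n_rows, n_cols = sample_df.shape
print(f"{n_rows} rows, {n_cols} columns")  # prints: 3 rows, 3 columns
```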


Next, we output the first 5 rows of the dataset. In Spyder, try using the Variable Explorer to view the dataset. 

In [8]:
# Print the first 5 rows of the dataframe
squirrels_df.head()

|  | X | Y | Unique Squirrel ID | Hectare | Shift | Date | Hectare Squirrel Number | Age | Primary Fur Color | Highlight Fur Color | ... | Kuks | Quaas | Moans | Tail flags | Tail twitches | Approaches | Indifferent | Runs from | Other Interactions | Lat/Long |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -73.956134 | 40.794082 | 37F-PM-1014-03 | 37F | PM | 10142018 | 3 |  |  |  | ... | False | False | False | False | False | False | False | False |  | POINT (-73.9561344937861 40.7940823884086) |
| 1 | -73.968857 | 40.783783 | 21B-AM-1019-04 | 21B | AM | 10192018 | 4 |  |  |  | ... | False | False | False | False | False | False | False | False |  | POINT (-73.9688574691102 40.7837825208444) |
| 2 | -73.974281 | 40.775534 | 11B-PM-1014-08 | 11B | PM | 10142018 | 8 |  | Gray |  | ... | False | False | False | False | False | False | False | False |  | POINT (-73.97428114848522 40.775533619083) |
| 3 | -73.959641 | 40.790313 | 32E-PM-1017-14 | 32E | PM | 10172018 | 14 | Adult | Gray |  | ... | False | False | False | False | False | False | False | True |  | POINT (-73.9596413903948 40.7903128889029) |
| 4 | -73.970268 | 40.776213 | 13E-AM-1017-05 | 13E | AM | 10172018 | 5 | Adult | Gray | Cinnamon | ... | False | False | False | False | False | False | False | False |  | POINT (-73.9702676472613 40.7762126854894) |


By default, `pandas` displays at most 20 columns and replaces the rest with `...`. To show every column, set the `display.max_columns` option to `None` as shown below, then re-run `head()`. This is particularly useful in Jupyter notebooks. 

In [9]:
# Display all pandas columns  
pd.set_option('display.max_columns', None)

# Print the first 5 rows of the dataframe
squirrels_df.head(5)

|  | X | Y | Unique Squirrel ID | Hectare | Shift | Date | Hectare Squirrel Number | Age | Primary Fur Color | Highlight Fur Color | Combination of Primary and Highlight Color | Color notes | Location | Above Ground Sighter Measurement | Specific Location | Running | Chasing | Climbing | Eating | Foraging | Other Activities | Kuks | Quaas | Moans | Tail flags | Tail twitches | Approaches | Indifferent | Runs from | Other Interactions | Lat/Long |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -73.956134 | 40.794082 | 37F-PM-1014-03 | 37F | PM | 10142018 | 3 |  |  |  | + |  |  |  |  | False | False | False | False | False |  | False | False | False | False | False | False | False | False |  | POINT (-73.9561344937861 40.7940823884086) |
| 1 | -73.968857 | 40.783783 | 21B-AM-1019-04 | 21B | AM | 10192018 | 4 |  |  |  | + |  |  |  |  | False | False | False | False | False |  | False | False | False | False | False | False | False | False |  | POINT (-73.9688574691102 40.7837825208444) |
| 2 | -73.974281 | 40.775534 | 11B-PM-1014-08 | 11B | PM | 10142018 | 8 |  | Gray |  | Gray+ |  | Above Ground | 10.0 |  | False | True | False | False | False |  | False | False | False | False | False | False | False | False |  | POINT (-73.97428114848522 40.775533619083) |
| 3 | -73.959641 | 40.790313 | 32E-PM-1017-14 | 32E | PM | 10172018 | 14 | Adult | Gray |  | Gray+ | Nothing selected as Primary. Gray selected as ... |  |  |  | False | False | False | True | True |  | False | False | False | False | False | False | False | True |  | POINT (-73.9596413903948 40.7903128889029) |
| 4 | -73.970268 | 40.776213 | 13E-AM-1017-05 | 13E | AM | 10172018 | 5 | Adult | Gray | Cinnamon | Gray+Cinnamon |  | Above Ground |  | on tree stump | False | False | False | False | True |  | False | False | False | False | False | False | False | False |  | POINT (-73.9702676472613 40.7762126854894) |
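
The `pd.set_option()` call above changes the display setting for the rest of the session. If you only want wide output for a single display, `pd.option_context()` applies the setting temporarily and restores the previous value afterwards. The sketch below uses a small made-up dataframe (`wide_df`, hypothetical) rather than the squirrel data so that it runs offline.

```python
import pandas as pd

# Hypothetical wide table: 1 row and 30 columns named col_0 ... col_29
wide_df = pd.DataFrame([list(range(30))], columns=[f"col_{i}" for i in range(30)])

# Remember the current setting so we can confirm it is restored later
setting_before = pd.get_option("display.max_columns")

# Inside the "with" block, all columns are displayed
with pd.option_context("display.max_columns", None):
    setting_inside = pd.get_option("display.max_columns")
    print(wide_df.head())

# Outside the block, the earlier setting is back in effect
setting_after = pd.get_option("display.max_columns")
print(setting_after == setting_before)  # prints: True
```

To undo a global `pd.set_option()` call, `pd.reset_option("display.max_columns")` returns the option to its default.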


Lastly, we can write the dataframe out to a local CSV file using the dataframe's `to_csv()` method. 

In [10]:
# Write the CSV locally 
squirrels_df.to_csv('squirrels_census_2018.csv')
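
By default, `to_csv()` also writes the dataframe's row index as an extra first column, which comes back as a column named `Unnamed: 0` when the file is read again. Passing `index=False` leaves it out. The sketch below shows the difference with a tiny made-up dataframe written to a temporary folder; the data and file name are illustrative only.

```python
import os
import tempfile

import pandas as pd

# Made-up rows (NOT real census data)
small_df = pd.DataFrame(
    {"squirrel_id": ["A-01", "A-02"], "fur_color": ["Gray", "Cinnamon"]}
)

# Write into a temporary folder that is deleted automatically afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "example.csv")

    # Default behavior: the row index is written as an unnamed first column
    small_df.to_csv(csv_path)
    with_index = pd.read_csv(csv_path)

    # index=False: only the data columns are written
    small_df.to_csv(csv_path, index=False)
    without_index = pd.read_csv(csv_path)

print(list(with_index.columns))     # prints: ['Unnamed: 0', 'squirrel_id', 'fur_color']
print(list(without_index.columns))  # prints: ['squirrel_id', 'fur_color']
```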