# **Python Basics for City Data**
## An Introductory Tutorial Using Chicago's Municipal Datasets
***
## INTRODUCTION
***

### **Overview**

This introduction gives a concise overview of Python and its benefits. In this notebook, we will also set up some fundamental Python tools (libraries) that the upcoming tutorials, and many Python projects, rely on.

For those already familiar with or uninterested in the background of what Python and Jupyter Notebooks are, **this introduction is optional.**
However, make sure **all of the necessary libraries (listed in the last section of this notebook) are installed to complete the rest of the tutorial.**

<br>

***

#### **What is Python?**
##### Python is a versatile, high-level programming language known for its readability and efficiency. 
<br>

***

#### **Why Use Python?** <br>
Below are four reasons (among many) why Python is a great tool for data analysts, scientists, and professionals:



**1. Handles Big Data:** Python can work with much larger amounts of data than Excel, which caps each worksheet at 1,048,576 rows. That capacity is essential for city-scale data.

**2. Ready-to-Use Tools:** It has many ready-made tools (called 'libraries') that make common data tasks, like organizing and visualizing data, much easier and faster. Some libraries come with Python, and others are installed separately.

**3. Saves Time:** Python can automate repetitive data tasks, reducing time and effort.

**4. Strong Support Community:** A large number of users and experts contribute to a supportive community, making it easier to find help and resources.
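
To make the "Saves Time" point concrete, here is a minimal sketch of automating a repetitive task. It uses only modules that come with Python. The monthly data, the column names, and the function name `count_request_types` are all made up for illustration; real exports would normally be CSV files on disk.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for monthly service-request exports.
# In practice, each entry would be a CSV file saved on your computer.
monthly_exports = {
    "january": "request_type,ward\nPothole,1\nGraffiti,2\nPothole,3\n",
    "february": "request_type,ward\nPothole,2\nStreet Light,2\n",
}


def count_request_types(csv_text):
    """Count how many rows fall under each request type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["request_type"] for row in reader)


# The same summary runs on every month without manual copy-and-paste.
summaries = {
    month: count_request_types(text) for month, text in monthly_exports.items()
}

for month, counts in summaries.items():
    print(month, dict(counts))
```

Adding a third month only requires one more entry in `monthly_exports`; the counting logic stays the same.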

<br>

***

#### **Python Advantages** <br>
##### Data professionals frequently rely on tools like Excel and R for data analysis. However, there are scenarios where Python may be a more suitable choice, either as a complement or as a replacement for these tools. Here are the key reasons that make Python a compelling option:


* **Python vs. Excel:** Python is often a better fit for city data professionals for several reasons. Because an analysis is written as a script, every step, from cleaning data to the final analysis, is recorded in code and can be rerun or reviewed later. In Excel, by contrast, tracing the steps behind a result can be challenging. Code sharing in Python **promotes consistent, transparent outcomes.** Additionally, Python can connect to many kinds of data sources, such as files, databases, and web APIs, which broadens its usefulness across projects.

* **Python vs. R:** The choice between Python and R for data analysis often hinges on the project's specific requirements and the user's comfort with each language, as both have their merits. However, Python is often favored over R for several reasons. As a general-purpose language, it suits a broad array of tasks beyond data analysis, including automation and software development. It is also commonly used for large-scale data processing. Additionally, Python is widely recognized for its strengths in Machine Learning and Artificial Intelligence, supported by robust libraries like TensorFlow and scikit-learn, whereas R is traditionally known for its statistical analysis capabilities.
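
The idea that a script records every analysis step can be sketched as follows. This is only an illustration: the records, column names, and function names (`clean_records`, `total_permits`) are hypothetical, and the numbers are not real city data.

```python
# Hypothetical raw records; the values are made up for illustration.
raw_records = [
    {"neighborhood": " Logan Square ", "permits": "12"},
    {"neighborhood": "HYDE PARK", "permits": "7"},
    {"neighborhood": "", "permits": "3"},  # missing neighborhood name
    {"neighborhood": "logan square", "permits": "5"},
]


def clean_records(records):
    """Apply each cleaning step in order and return the cleaned rows."""
    cleaned = []
    for record in records:
        # Step 1: trim stray spaces and standardize capitalization.
        name = record["neighborhood"].strip().title()
        # Step 2: skip rows that have no neighborhood name.
        if not name:
            continue
        # Step 3: convert the permit count from text to a whole number.
        cleaned.append({"neighborhood": name, "permits": int(record["permits"])})
    return cleaned


def total_permits(records):
    """Add up the permit counts for each neighborhood."""
    totals = {}
    for record in records:
        name = record["neighborhood"]
        totals[name] = totals.get(name, 0) + record["permits"]
    return totals


permit_totals = total_permits(clean_records(raw_records))
print(permit_totals)
```

Anyone who receives this code can see exactly which rows were dropped and how names were standardized, then rerun the same steps on new data.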

<br>

***

#### **Jupyter Notebooks** <br>
##### Python code is often written in a plain script file, which is typically saved with a **'.py'** extension. Scripts are often used for automating tasks and running applications.
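
As a rough illustration, a small script might look like the sketch below. The file name `hello_city.py` and the function names are made up for this example.

```python
# hello_city.py (hypothetical file name)


def greeting(city):
    """Build a short message for the given city name."""
    return f"Hello, {city}!"


def main():
    print(greeting("Chicago"))


# This check makes main() run only when the file is executed directly,
# for example with: python hello_city.py
if __name__ == "__main__":
    main()
```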

##### This tutorial is written using a **Jupyter Notebook** (typically saved with a **'.ipynb'** extension), which is an interactive application for writing and running Python.

##### Both Jupyter Notebooks and basic Python scripts are useful in data science and data analytics. We might choose to use a Jupyter Notebook over a Python script when analyzing city data because:

* **Interactivity:** Jupyter Notebooks allow for interactive coding, where you can write, run, and modify code in chunks (cells) and see the output immediately.

* **Integration of Code and Documentation:** Notebooks enable the integration of code with rich text, equations, and visualizations, making them ideal for data analysis, teaching, and presenting results.

* **Ease of Visualization:** They make it easy to create and display plots and charts inline, directly below the code that generates them.

* **Experimentation and Exploration:** Notebooks are great for experimenting with code and data due to their interactive nature, allowing for immediate feedback and iterative exploration.

* **Sharing and Collaboration:** Jupyter Notebooks are easily shareable as complete computational narratives, making them useful for collaborative projects and educational purposes.


<br>

***

#### **Installing Basic Python Libraries Step-by-Step** <br>
##### Python has many useful libraries. We will use only a few basic ones that are widely used across Python projects. The libraries we will install are:

* NumPy
* pandas
* Matplotlib
* scikit-learn

**Select each cell below and press "Shift" and "Enter" at the same time to run the code and install the library.**

It may take a few minutes to install.

```python
pip install numpy
```

```python
pip install pandas
```

```python
pip install matplotlib
```

```python
pip install -U scikit-learn
```

**If the outputs all say that the packages were successfully installed, then congratulations! You have all of the necessary libraries and you can complete the rest of the tutorials starting with Part 1.**
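
As an optional extra check, the sketch below looks up the installed version of each library. It assumes Python 3.8 or newer, where the standard `importlib.metadata` module is available. The function name `installed_versions` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names as they are passed to pip in the cells above.
required_packages = ["numpy", "pandas", "matplotlib", "scikit-learn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(required_packages).items():
    status = installed if installed is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

If any line shows `NOT INSTALLED`, rerun the matching install cell above and then run this check again.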