# Introduction to Python

Welcome! This notebook will help you start learning Python and is a good starting point for tasks related to data analytics. You will use Jupyter notebooks, a powerful tool that lets you write and run Python code interactively.

Python is a high-level, versatile programming language that is easy to read and write. It's widely used in data science, machine learning, web development, and many other fields.


## Software Requirements

To begin, make sure you have installed Python 3, which is the version of Python we'll be using. You'll also need Jupyter, a tool that lets you write Python code in a file called a 'notebook'. These notebooks make it easy to run code and see the results right away.

To run the exercises in this notebook, you need:

1. Python 3 installed on your computer.
2. Access to Jupyter Notebooks, where you can write and run your Python code.

You can install both easily using the [Anaconda distribution](https://anaconda.com), which comes with Python 3 and Jupyter ready to use; follow the download and installation instructions on that site.

Here are some ways to access Jupyter:

- Use [Google Colaboratory](https://colab.research.google.com/), a free online service that runs Python code in your web browser.
- Install Jupyter on your computer by following the instructions at [jupyter.org](https://jupyter.org/install).
- Use [JupyterHub](https://jupyter.org/hub) for collaborative work.

## Basic Commands

In this lab, we will explore some basic Python commands to help you get started with programming. If you're new to Python and want to dive deeper, you can check out the official Python tutorial at [docs.python.org/3/tutorial/](https://docs.python.org/3/tutorial/), which provides a comprehensive guide for beginners.


## Variables and Data Types

Python supports several data types including integers, floats, strings, and booleans.

Example of variables and data types:
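
A minimal sketch of the four data types named above. The variable names and values are illustrative assumptions, not part of the original notebook.

```python
# Each variable is bound to a value; type() reports that value's data type.
age = 30            # int
height = 1.75       # float
name = "Ada"        # str
is_student = True   # bool

print(type(age), type(height), type(name), type(is_student))
```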

## Basic Arithmetic Operations

You can use Python just like a calculator to perform basic arithmetic.
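
As a short illustration, the operators below cover the usual calculator operations. The operands 7 and 3 are arbitrary example values.

```python
total = 7 + 3            # addition: 10
difference = 7 - 3       # subtraction: 4
product = 7 * 3          # multiplication: 21
quotient = 7 / 3         # true division, always returns a float
floor_quotient = 7 // 3  # floor division: 2
remainder = 7 % 3        # modulo: 1
power = 7 ** 3           # exponentiation: 343

print(total, difference, product, quotient, floor_quotient, remainder, power)
```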

## Functions in Python

A function is a block of code that performs a specific task. You can create your own functions using the `def` keyword.

### Example
- `def greet(name):` defines a function named greet that takes one argument name.
- The function returns a greeting string using the argument.
- e.g., we call the function with the argument "World".
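
The steps listed above can be sketched as follows. The exact wording of the greeting string is an assumption, since the original cell is not shown.

```python
def greet(name):
    """Return a greeting string built from the argument."""
    return f"Hello, {name}!"

# Call the function with the argument "World".
message = greet("World")
print(message)  # prints: Hello, World!
```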

## Lists

*Lists* in Python are used to store multiple items in a single variable.

### List length
The length of a list, i.e., the number of elements it contains, is returned by the built-in function `len()`, e.g., `len(my_list)`.
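
For instance, with a small list of illustrative values (the contents of `my_list` are an assumption):

```python
# A list is written with square brackets and comma-separated items.
my_list = [10, 20, 30, 40, 50]

n = len(my_list)
print(n)  # prints: 5
```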

### Accessing elements of a list
#### First element

#### Note:
Python uses zero-based indexing, i.e., indexing starts at 0. Therefore, the first element of a list is accessed with index 0, e.g., `my_list[0]`.

#### Last element

Because Python starts indexing from 0, the second element has index 1, the third has index 2, and the last element has index `len(my_list) - 1`.

Python also accepts negative indices: index `-1` refers to the last element, e.g., `my_list[-1]` returns the last element of the list.
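
A brief sketch of first-element and last-element access, using an example list whose values are assumed for illustration:

```python
my_list = [10, 20, 30, 40, 50]

first = my_list[0]                          # index 0 is the first element
last_by_length = my_list[len(my_list) - 1]  # last element via its positive index
last = my_list[-1]                          # same element via a negative index

print(first, last_by_length, last)  # prints: 10 50 50
```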

### Adding elements to a list

`append()` is the list method that adds a single element to the end of a list.
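
A small example, with assumed list values:

```python
my_list = [10, 20, 30]

# append() modifies the list in place and adds the item at the end.
my_list.append(40)
print(my_list)  # prints: [10, 20, 30, 40]
```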

### Removing elements from a list

`remove()` is the list method that removes the first occurrence of a given value from a list.
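
The sketch below uses an assumed list with a repeated value to show that only the first match is removed:

```python
my_list = [10, 20, 30, 20]

# remove() deletes the first element equal to the given value, in place.
my_list.remove(20)
print(my_list)  # prints: [10, 30, 20]
```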

### Slicing a List:
You can extract portions of the list using slicing.
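
A few common slice forms, shown on an illustrative list. A slice `start:stop:step` includes `start` and excludes `stop`.

```python
my_list = [10, 20, 30, 40, 50]

first_three = my_list[:3]    # elements at indices 0, 1, 2
middle = my_list[1:4]        # elements at indices 1, 2, 3
every_other = my_list[::2]   # every second element, starting at index 0

print(first_three, middle, every_other)
```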

## Array

If you need an `array` that can only hold one type of element (like integers or floats), Python’s `array` module can be used.

Importing the `array` Module:
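
A minimal sketch of importing the module and creating a typed array. The name `int_array` and its values are assumptions for illustration.

```python
import array

# The first argument is the type code; the second supplies the initial values.
int_array = array.array("i", [1, 2, 3, 4])
print(int_array)  # prints: array('i', [1, 2, 3, 4])
```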

`i` stands for "integer", which defines the type of elements the array will hold. Other types include `f` for floats.

### Accessing Elements:
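
Elements of an `array` are accessed by index in the same way as list elements. The example array here is assumed for illustration.

```python
import array

int_array = array.array("i", [1, 2, 3, 4])

first = int_array[0]    # zero-based indexing, as with lists
last = int_array[-1]    # negative indices count from the end

print(first, last)  # prints: 1 4
```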

### Adding and Removing Elements:
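
One possible walk-through of the `array` methods for adding and removing elements, using assumed values:

```python
import array

int_array = array.array("i", [1, 2, 3])

int_array.append(4)       # add at the end        -> 1, 2, 3, 4
int_array.insert(0, 0)    # insert 0 at index 0   -> 0, 1, 2, 3, 4
int_array.remove(2)       # remove first value 2  -> 0, 1, 3, 4
popped = int_array.pop()  # remove and return the last element (4)

print(list(int_array), popped)  # prints: [0, 1, 3] 4
```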

# Python Libraries

Libraries in Python are essential because they provide pre-built code that simplifies complex tasks. Instead of writing everything from scratch, you can use libraries to access ready-made functions and modules for tasks like data manipulation (e.g., `pandas`), numerical computation (e.g., `numpy`), web development (e.g., `flask`), or machine learning (e.g., `scikit-learn`). This saves time, reduces errors, and allows you to focus on building solutions rather than reinventing common functionality.

## NumPy

`NumPy` (Numerical Python) is a powerful library in Python used for numerical computations. It provides support for arrays, matrices, and many mathematical functions, making it essential for data science, machine learning, and scientific computing.

In this lab, we'll go over the basics of `NumPy`, including array creation, indexing, and common operations.

First, we `import` it into our Python environment:
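
The alias `np` used here is a widespread convention rather than a requirement:

```python
import numpy as np

# Confirm the import worked by printing the installed version.
print(np.__version__)
```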

### Creating Arrays

The core object in `NumPy` is the `array`, which is a grid of values, all of the same type. You can create arrays in several ways:

Creating Arrays from Lists:
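
A short sketch with an assumed list of integers:

```python
import numpy as np

# np.array() converts a Python list into a NumPy array.
arr = np.array([1, 2, 3, 4, 5])
print(arr)  # prints: [1 2 3 4 5]
```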

### Creating Arrays with zeros, ones, and empty:
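A minimal illustration of the three constructors; the shapes are chosen arbitrarily. Note that `np.empty` does not initialize its contents, so its values are whatever happened to be in memory.

```python
import numpy as np

zeros = np.zeros(3)       # 1D array of three 0.0 values
ones = np.ones((2, 3))    # 2 rows, 3 columns, all 1.0
empty = np.empty(4)       # 4 uninitialized values; do not rely on them

print(zeros)
print(ones)
print(empty.shape)
```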

### Creating Arrays with arange and linspace:
- `arange(start, stop, step)`: Creates an array of values from `start` up to, but not including, `stop`, spaced by `step`.
- `linspace(start, stop, num)`: Creates an array of `num` equally spaced values between `start` and `stop`, with `stop` included by default.
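
Both functions in a small example; the arguments are illustrative:

```python
import numpy as np

evens = np.arange(0, 10, 2)     # 0, 2, 4, 6, 8 (10 is excluded)
points = np.linspace(0, 1, 5)   # 0.0, 0.25, 0.5, 0.75, 1.0 (1 is included)

print(evens)
print(points)
```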

### Multi-Dimensional Arrays:
You can create 2D or even multi-dimensional arrays using `NumPy`.

Example of creating a 2D array (matrix)

Example of creating an array (matrix) of zeros with 3 columns and 5 rows.
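
Both examples can be sketched as follows. The values in `matrix` are assumed; the zeros matrix uses the shape `(5, 3)`, since NumPy shapes list rows first and columns second.

```python
import numpy as np

# A 2D array (matrix) built from a list of rows.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# A matrix of zeros with 5 rows and 3 columns.
zeros_matrix = np.zeros((5, 3))

print(matrix)
print(zeros_matrix)
```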

### Array Attributes
Once you have an array, you can access information about its shape, size, and data type.

`shape` returns the shape of the array (rows, columns)

`ndim` returns number of dimensions (2D in this case)

`size` returns the total number of elements

`dtype` returns the data type of the array elements
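
The four attributes on an assumed 2-by-3 integer matrix. The exact integer `dtype` depends on the platform and NumPy version.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(matrix.shape)  # prints: (2, 3)
print(matrix.ndim)   # prints: 2
print(matrix.size)   # prints: 6
print(matrix.dtype)  # an integer type such as int64 (platform dependent)
```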

### Indexing and Slicing Arrays
`NumPy` arrays can be indexed just like lists in Python, and they support multi-dimensional indexing and slicing.

Accessing a single element

Slicing a 2D array
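
A sketch of single-element access and 2D slicing on an assumed 3-by-3 matrix. Indices are given as `[row, column]`.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

element = matrix[1, 2]       # row 1, column 2 -> 6
first_row = matrix[0]        # the whole first row
second_col = matrix[:, 1]    # every row, column 1
block = matrix[:2, 1:]       # first two rows, last two columns

print(element, first_row, second_col)
print(block)
```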

### Array Operations
`NumPy` makes it easy to perform element-wise operations on arrays.

Arithmetic Operations:
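
With a scalar operand, the operation is applied to every element. The array values are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

plus_ten = arr + 10   # add 10 to each element
doubled = arr * 2     # multiply each element by 2
squared = arr ** 2    # square each element

print(plus_ten, doubled, squared)
```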

### Array Operations Between Arrays:
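For two arrays of the same shape, operators pair up elements position by position. The arrays `a` and `b` are assumed examples.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

sums = a + b       # [1+4, 2+5, 3+6]
products = a * b   # element-wise product, not a dot product

print(sums, products)  # prints: [5 7 9] [ 4 10 18]
```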

### Mathematical Functions:
`NumPy` provides a variety of built-in mathematical functions like sin, cos, exp, etc.
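
A brief example; the input values are chosen for illustration. Results are floating-point, so values such as `sin(pi)` come out very close to, but not exactly, zero.

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])

sines = np.sin(angles)                   # approximately [0, 1, 0]
cosines = np.cos(angles)                 # approximately [1, 0, -1]
exponentials = np.exp(np.array([0, 1]))  # [e**0, e**1]

print(sines, cosines, exponentials)
```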

### Reshaping Arrays
You can change the shape of arrays using the `reshape()` method.
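
For example, six elements can be rearranged into 2 rows and 3 columns. The new shape must hold the same total number of elements.

```python
import numpy as np

arr = np.arange(6)              # [0 1 2 3 4 5]
reshaped = arr.reshape(2, 3)    # [[0 1 2], [3 4 5]]

print(reshaped)
```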

### Joining and Splitting Arrays
You can join or concatenate arrays, as well as split them.

Concatenating Arrays:
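
A minimal sketch using `np.concatenate` on two assumed 1D arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Join the arrays end to end along their only axis.
joined = np.concatenate([a, b])
print(joined)  # prints: [1 2 3 4 5 6]
```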

Splitting Arrays:
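
A minimal sketch using `np.split`, which here divides an assumed six-element array into three equal parts:

```python
import numpy as np

arr = np.arange(6)          # [0 1 2 3 4 5]

# Split into 3 equally sized sub-arrays; the result is a list of arrays.
parts = np.split(arr, 3)
print(parts)
```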

### Broadcasting in NumPy
Broadcasting is a powerful feature in `NumPy` that allows you to perform arithmetic operations on arrays of different shapes.
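
As an illustration with assumed values, a 1D array of length 3 can be added to a 2-by-3 matrix: NumPy treats the 1D array as if it were repeated for each row.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)

# row is broadcast across both rows of matrix.
result = matrix + row
print(result)
```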

### Random Numbers with NumPy
`NumPy` provides a `random` module to generate random numbers.

Example of randomly generating an array of random numbers between 0 and 1

Example of randomly generating a 5-element array of integers between 1 and 5
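
One way to sketch both examples is with the `Generator` interface from `np.random`; the older functions `np.random.rand` and `np.random.randint` are an alternative. The seed and the array length of 5 for the first example are assumptions.

```python
import numpy as np

# A seeded generator makes the random draws reproducible.
rng = np.random.default_rng(seed=0)

uniform = rng.random(5)                 # 5 floats in the interval [0, 1)
integers = rng.integers(1, 6, size=5)   # 5 integers from 1 to 5 (6 is excluded)

print(uniform)
print(integers)
```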

### Aggregating Data
NumPy provides convenient functions for aggregating data, such as `sum`, `mean`, `max`, and `min`.
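
A short example on an assumed 2-by-3 array. The optional `axis` argument aggregates down columns (`axis=0`) or across rows (`axis=1`).

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = data.sum()           # sum of all elements: 21
average = data.mean()        # mean of all elements: 3.5
col_max = data.max(axis=0)   # maximum of each column
row_min = data.min(axis=1)   # minimum of each row

print(total, average, col_max, row_min)
```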

### Linear Algebra with NumPy
NumPy also has support for linear algebra operations such as matrix multiplication and solving linear systems.

Matrix multiplication:
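
A sketch using the `@` operator on two assumed 2-by-2 matrices; `np.matmul(A, B)` gives the same result.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Matrix product: rows of A combined with columns of B.
product = A @ B
print(product)
```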

Transpose of a matrix:
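
A sketch using the `.T` attribute on an assumed 2-by-3 matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Transposing swaps rows and columns: shape (2, 3) becomes (3, 2).
A_T = A.T
print(A_T)
```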

## Pandas

Pandas is a powerful Python library used for data manipulation and analysis. It provides data structures like DataFrames and Series that make it easy to work with structured data (e.g., spreadsheets, CSV files, SQL tables) and perform operations such as data cleaning, filtering, and aggregation.

In this tutorial, we'll explore the basics of using Pandas, including DataFrames, Series, reading/writing data, and common operations.

First, we `import` `pandas` into our Python environment:
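
The alias `pd` is a common convention, not a requirement:

```python
import pandas as pd

# Confirm the import worked by printing the installed version.
print(pd.__version__)
```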

### Pandas Data Structures
Pandas primarily uses two data structures:

- Series: A one-dimensional labeled array.
- DataFrame: A two-dimensional table of data (like a spreadsheet).
  
### Creating a Series:

A Pandas Series is similar to a list or a column in a table.
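
A minimal sketch with assumed values and labels. Unlike a plain list, each element of a Series carries an index label.

```python
import pandas as pd

# Values plus an explicit index of labels.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

print(s)
print(s["b"])  # access by label; prints: 20
```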

### Creating a DataFrame:
A Pandas DataFrame is a table with rows and columns. You can create it from dictionaries, lists, or by reading data from files.
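
For example, a DataFrame can be built from a dictionary whose keys become column names. The names and ages are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "age": [28, 35, 42],
})

print(df)
```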

### Reading Data into a DataFrame
Pandas can read data from various file formats such as CSV, Excel, JSON, and SQL databases.

Reading from a CSV file:

We use the 2024 play-by-play data from [NFLsavant.com](https://nflsavant.com/about.php).
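
The sketch below shows the `pd.read_csv` call on a small in-memory stand-in so that it runs without a download. The column names and values are invented and do not reproduce the real data set; with a file on disk you would pass its path instead (the file name in the comment is hypothetical).

```python
import io
import pandas as pd

# Stand-in CSV text; with a real file you would call something like
# pd.read_csv("pbp-2024.csv") (hypothetical file name).
csv_text = "GameId,Quarter,Yards\n1,1,5\n1,2,12\n"

plays = pd.read_csv(io.StringIO(csv_text))
print(plays)
```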

Reading from an Excel file:

We use the [Electric Vehicle (EV) and EV Charging Station Data](https://www.nyserda.ny.gov/All-Programs/Drive-Clean-Rebate-For-Electric-Cars-Program/About-Electric-Cars/Data-on-Electric-Vehicles-and-Charging-Stations) from [NYSERDA](https://www.nyserda.ny.gov).

### Writing Data to Files
Pandas allows you to easily export your DataFrame to various file formats.

Writing to a CSV file:
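
A minimal sketch of `to_csv` on an assumed DataFrame. When no path is given, the method returns the CSV text as a string, which keeps this example free of file-system side effects; the file name in the comment is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [28, 35]})

# index=False omits the row labels from the output.
csv_text = df.to_csv(index=False)
print(csv_text)

# To write to disk instead, pass a path (hypothetical file name):
# df.to_csv("output.csv", index=False)
```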

### Inspecting Data
Pandas offers several methods to quickly understand the structure and content of your DataFrame.

Viewing the Data:

Getting Column Names:

Summary Statistics:
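
The three inspection steps above (viewing the data, getting column names, and summary statistics) can be sketched on an assumed DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "age": [28, 35, 42],
})

print(df.head(2))            # first 2 rows (head() defaults to 5)
print(df.columns.tolist())   # prints: ['name', 'age']

summary = df.describe()      # count, mean, std, min, quartiles, max
print(summary)               # covers numeric columns only by default
```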

### Selecting Data from a DataFrame
You can select specific rows and columns from a DataFrame in various ways.

Selecting Columns:

Selecting Rows:

You can use `.iloc[]` for positional indexing and `.loc[]` for label-based indexing.
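
A sketch of column and row selection. The DataFrame and its string row labels are assumptions made so that the difference between `.iloc[]` and `.loc[]` is visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cara"], "age": [28, 35, 42]},
    index=["x", "y", "z"],
)

ages = df["age"]               # one column, returned as a Series
subset = df[["name", "age"]]   # a list of columns returns a DataFrame

first_row = df.iloc[0]         # by position: the first row
row_y = df.loc["y"]            # by label: the row labeled "y"

print(first_row["name"], row_y["name"])  # prints: Ana Ben
```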

Filtering Rows:

You can filter rows based on conditions.
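
For instance, a boolean condition on a column selects only the matching rows. The data and the threshold of 30 are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "age": [28, 35, 42],
})

# df["age"] > 30 yields a True/False value per row; indexing keeps the True rows.
older = df[df["age"] > 30]
print(older)
```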

### Data Cleaning
Pandas makes it easy to clean and handle missing or incorrect data.

Handling Missing Data:
You can handle missing data (NaN values) using the `dropna()` and `fillna()` methods.
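
A small example with assumed data. Both methods return a new DataFrame and leave the original unchanged.

```python
import pandas as pd

nan = float("nan")
df = pd.DataFrame({"a": [1.0, nan, 3.0], "b": [4.0, 5.0, nan]})

dropped = df.dropna()   # keep only rows with no missing values
filled = df.fillna(0)   # replace every missing value with 0

print(dropped)
print(filled)
```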

Renaming Columns:

Removing Duplicates:
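
Renaming columns and removing duplicates can be sketched with `rename()` and `drop_duplicates()`. The data, including the abbreviated column name `nm`, is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"nm": ["Ana", "Ben", "Ana"], "age": [28, 35, 28]})

# Map old column names to new ones.
renamed = df.rename(columns={"nm": "name"})

# Drop rows that exactly repeat an earlier row.
unique_rows = renamed.drop_duplicates()

print(unique_rows)
```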

### Modifying Data
You can add, update, or delete columns in a DataFrame.

Adding a New Column:

Updating Column Values:

Deleting a Column:
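
The three modifications above (adding, updating, and deleting a column) in one sketch, using assumed data and a hypothetical `age_in_months` column:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [28, 35]})

# Add a new column computed from an existing one.
df["age_in_months"] = df["age"] * 12

# Update the values of an existing column.
df["age"] = df["age"] + 1

# Delete a column; drop() returns a new DataFrame.
df = df.drop(columns=["age_in_months"])

print(df)
```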

### GroupBy and Aggregation

The `groupby()` method groups rows by the values in one or more columns; you can then apply aggregate functions such as `sum()` or `mean()` to each group.
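
A minimal sketch with invented team data: rows are grouped by `team`, and the mean of `points` is computed per group.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "points": [10, 20, 5, 15],
})

mean_points = df.groupby("team")["points"].mean()
print(mean_points)
```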

Aggregating Multiple Columns:
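
One way to do this is to pass `agg()` a dictionary mapping each column to an aggregate function. The data is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "points": [10, 20, 5, 15],
    "fouls": [1, 3, 2, 4],
})

# Sum the points and average the fouls within each team.
summary = df.groupby("team").agg({"points": "sum", "fouls": "mean"})
print(summary)
```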

### Merging and Joining DataFrames
Pandas allows you to merge or join DataFrames, similar to SQL joins.

Merging DataFrames:
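
A sketch of `pd.merge` on two assumed tables that share an `id` column. The `how` argument selects the join type, as in SQL.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cara"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [88, 92, 75]})

# Inner join: keep only ids present in both tables.
inner = pd.merge(left, right, on="id", how="inner")

# Left join: keep every row of left; unmatched scores become NaN.
left_join = pd.merge(left, right, on="id", how="left")

print(inner)
print(left_join)
```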

### Pivot Tables
Pandas allows you to create pivot tables for summarizing and analyzing data.
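
A minimal example using `pivot_table` on invented sales data: regions become rows, products become columns, and revenue is summed in each cell.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [100, 150, 200, 50],
})

pivot = sales.pivot_table(
    values="revenue", index="region", columns="product", aggfunc="sum"
)
print(pivot)
```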

## Visualizations
In Python, the most commonly used library for creating visualizations is `matplotlib`. However, since Python wasn't initially designed with data analysis in mind, plotting functionality is not a built-in feature of the language. Instead, we use the `subplots()` function from the `matplotlib.pyplot` module to generate figures and axes for our plots.

If you're interested in exploring more examples of how to create various types of plots using Python, you can visit the official gallery at [matplotlib.org/stable/gallery/](https://matplotlib.org/stable/gallery/index.html).

In `matplotlib`, every plot is composed of two key elements: the figure and the axes. The figure can be seen as the overall space or canvas where one or more plots are drawn. The axes are where the actual data is plotted, and they contain additional elements like the labels for the x- and y-axes, the title, and other related information. Keep in mind that in `matplotlib`, axes refers to more than just the x- and y-axes; it encapsulates all components necessary for the plot.

To get started, we first import the `subplots()` function from `matplotlib.pyplot`, which we'll frequently use for creating plots. This function returns two objects: the first is the figure, and the second is the axes onto which we can plot our data. We typically specify the figure size using the `figsize` argument. Once the axes are created, we can begin plotting by using the `plot()` method. For more information about this method, you can type `ax.plot?` in a Jupyter cell to view its documentation.
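
The steps described above can be sketched as follows. The data, figure size, labels, and title are assumptions for illustration; in a Jupyter notebook the figure is normally displayed when the cell finishes.

```python
from matplotlib.pyplot import subplots

# subplots() returns the figure (canvas) and the axes (plotting area).
fig, ax = subplots(figsize=(8, 8))

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

ax.plot(x, y)                        # draw a line through the points
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("A simple line plot")
```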

## Loops

Python supports loops to repeat a block of code multiple times. The most common types of loops are `for` loops and `while` loops.

### For loop

A `for` loop in Python is used to iterate over a sequence (like a list, tuple, string, or range) and execute a block of code for each item in that sequence. The key components of a `for` loop are:

1. `for` Keyword:
This keyword starts the loop and tells Python that you are about to iterate over a sequence.

2. Loop Variable:
The variable that takes each value in the sequence, one at a time. This is sometimes called the iteration variable, e.g., `i`.

3. `in` Keyword:
This keyword specifies the sequence you want to loop over.

4. Sequence/Iterable:
The collection of items you are iterating over, such as a list, range, tuple, or string.

5. Colon (`:`):
The colon indicates the start of the code block that will be executed on each iteration of the loop.

6. Loop Body/Block:
The indented code block that gets executed for each item in the sequence.

Example of `for` loop with a `range`:
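
A short sketch in which `range(5)` supplies the values 0 through 4. The list name `squares` is an assumption.

```python
squares = []

# i takes the values 0, 1, 2, 3, 4 in turn.
for i in range(5):
    squares.append(i ** 2)

print(squares)  # prints: [0, 1, 4, 9, 16]
```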

### While loop

A `while` loop in Python repeatedly executes a block of code as long as a given condition remains `True`. Here are the key components of a `while` loop:

1. `while` Keyword:
This starts the loop and indicates that the block of code should keep running as long as the condition is `True`.

2. Condition:
The expression that is evaluated before each iteration. If the condition is `True`, the loop continues; if `False`, the loop stops.

3. Colon (`:`):
This marks the beginning of the loop body.

4. Loop Body:
The block of code (indented) that will be executed repeatedly as long as the condition is true.

5. Update or Change:
(Optional) The loop body typically includes a way to eventually make the condition false, so the loop can stop. This is important to prevent infinite loops.

Example of a `while` loop:
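
A minimal sketch with an assumed counter. The update `count += 1` is what eventually makes the condition false and ends the loop.

```python
count = 0
seen = []

# The condition is checked before every iteration.
while count < 3:
    seen.append(count)
    count += 1   # without this update the loop would never stop

print(seen)  # prints: [0, 1, 2]
```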

## Conditionals

Python uses `if`, `elif`, and `else` statements to make decisions in your code.
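
As an illustration, the hypothetical helper `describe_number` below checks its conditions in order and runs only the first branch that matches.

```python
def describe_number(x):
    """Classify a number as positive, negative, or zero."""
    if x > 0:
        return "positive"
    elif x < 0:
        return "negative"
    else:
        return "zero"

print(describe_number(5))   # prints: positive
print(describe_number(-2))  # prints: negative
print(describe_number(0))   # prints: zero
```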