# Python Introduction Tutorial

This tutorial introduces Python programming for beginners & explains common analytic concepts. Detailed math explanations are beyond its scope; however, links to sites with additional material are provided throughout.

#### Topics Covered
1. Installing Python & Jupyter Notebook
2. Base Python Operations
3. Loading Packages & Data with Python
4. Data Types
5. Data Manipulation with Python
6. Writing Functions & Loops in Python
7. Data Visualization with Python



## Install Python
Python is an open source programming language that has become one of the most popular in the world due to its versatility & ease of use. To download Python, navigate to https://www.python.org/downloads/ & select the version appropriate for your machine. As with most programs, the installer will ask whether any custom settings are needed - selecting the defaults is fine.

Congratulations! You have downloaded & installed Python. Now the fun part - how to use Python. Python code can be written in a number of IDEs (Integrated Development Environments) & editors. If this is a new term, think "user-interface". A few options are listed below - some are easier to set up than others. In my opinion, whichever one you can install & use comfortably is the best choice.

1. JupyterLab / Jupyter Notebook (technically different things)
2. Visual Studio Code
3. Atom (archived by GitHub in December 2022 & no longer maintained)
4. Command Prompt
5. Most text editors.

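Installing JupyterLab / Jupyter Notebook is very straightforward, & notebooks are great for mixing code with Markdown notes, so I'll run through that example. To install on Windows:

1. Type Command Prompt into your 'Start Menu' & open it.
2. Type "py --version" - this confirms Python is installed. (Typing "py" alone starts the Python interpreter; if you do that, type "exit()" to return to the Command Prompt before the next step.)
3. Type "py -m pip install jupyterlab". Run this at the Command Prompt, not inside the Python interpreter.
4. If any prompts appear, answer 'yes' as needed.
5. After installation, type "jupyter lab" in the Command Prompt to launch JupyterLab.
6. The user-interface will open in your web browser & you'll be able to create Python notebooks & scripts from scratch.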
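
Once the interface is open, a quick way to confirm which Python it is using is to run a cell like the one below. This is a minimal sketch that relies only on the standard `sys` module; the variable name `version_text` is just an illustration.

```python
import sys

# Major & minor version of the Python interpreter running this code
version_text = f"{sys.version_info.major}.{sys.version_info.minor}"
print(version_text)

# Full path to that interpreter on your machine
print(sys.executable)
```

If the version printed is not the one you just installed, the notebook is likely running on a different Python installation.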

## Base Python Operations
Sure, Python can do a lot of fancy things. At its core, however, Python is still a programming language, & those fancy things are built from simple building blocks. This section covers some base functionality that may seem trivial but will be useful as you begin to develop more complex programs.

##### Basic Operators
1. '+' Addition
2. '-' Subtraction
3. '*' Multiplication
4. '/' Division
5. '**' Exponent
6. '%' Modulus
7. '//' Floor Division

In [None]:
### Python Addition
4 + 4

In [None]:
### Python Subtraction
10 - 5

In [None]:
### Python Multiplication
5 * 5

In [None]:
### Python Division
40 / 8

In [None]:
### Python Exponent
3**2

In [None]:
### Python Modulus
# Modulus = remainder
23 % 5

In [None]:
### Python Floor Division
# Division Rounded Down
23 // 5
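
A few details about the division operators are easy to miss. The sketch below is illustrative only; the variable names are not from the tutorial. It shows that `/` always returns a decimal number (a float), that `//` & `%` fit together, & that `//` rounds down rather than toward zero for negative numbers.

```python
# '/' returns a float even when the result is a whole number
quotient = 40 / 8
print(quotient, type(quotient))

# Floor division & modulus fit together:
# (a // b) * b + (a % b) gives back a
a, b = 23, 5
rebuilt = (a // b) * b + (a % b)
print(a // b, a % b, rebuilt)

# With a negative number, '//' still rounds DOWN (toward negative infinity)
negative_floor = -23 // 5
negative_mod = -23 % 5
print(negative_floor, negative_mod)
```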

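All of these operators can be combined in a single expression. Python evaluates it using its operator precedence rules, which follow PEMDAS: parentheses first, then exponents, then multiplication, division, floor division & modulus (left to right), then addition & subtraction. What does 8*((9-4)%2)+1 evaluate to? Explain why.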

In [None]:
### Multiple Operations Example
8*((9-4)%2)+1
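
One way to check your explanation is to break the expression into the steps Python takes. This is a sketch for checking your own answer; the intermediate variable names are made up for illustration.

```python
# Step 1: innermost parentheses
inner = 9 - 4              # 5

# Step 2: the outer parentheses apply modulus to that result
remainder = inner % 2      # 5 % 2 leaves a remainder of 1

# Step 3: multiplication comes before addition
product = 8 * remainder    # 8

# Step 4: addition is last
result = product + 1       # 9

# The step-by-step result matches the one-line expression
print(result == 8*((9-4)%2)+1)
```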

## Loading Packages & Data with Python

### Loading Packages
A package is a collection of functions that a person or organization wrote to simplify frequently used processes. For example, if I frequently have to calculate the square root of a number & add 5, & am tired of manually typing the steps every time, I can write a function to simplify the process:

import math <br>
def sq_root(num: float) -> float: <br>
&nbsp;&nbsp;&nbsp;&nbsp;return math.sqrt(num) + 5
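
Written as a runnable cell, the idea looks like the sketch below. The name `sq_root_plus_five` & the sample numbers are illustrative choices, not part of any package.

```python
import math

def sq_root_plus_five(num: float) -> float:
    """Return the square root of num, plus 5."""
    return math.sqrt(num) + 5

# Reuse the same function for several inputs instead of retyping the steps
results = [sq_root_plus_five(n) for n in [4, 9, 16]]
print(results)  # [7.0, 8.0, 9.0]
```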
    
Now consider that I have 1000 different calculations like this one. Rather than retype each of them in every project, I can write the functions once (1000 simple functions, or fewer but more flexible ones) & then package them, so I don't have to recreate them. This is a simple example, but the idea stands - a package is a grouping of functions that can be imported into your working session.

To import a package, use the syntax import (package) as (alias). Replace (package) with the actual name of the package & (alias) with an abbreviation. Aliases are optional but convenient, as they let you reference a package with fewer keystrokes. More on that later.
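
To make the alias idea concrete, here is a small sketch using `statistics`, a package that ships with Python. The alias `stats` & the sample values are arbitrary choices for illustration.

```python
# Import without an alias: functions are reached through the full package name
import statistics

# Import with an alias: the same package, reached through a shorter name
import statistics as stats

values = [2, 4, 6]

mean_full = statistics.mean(values)
mean_alias = stats.mean(values)

# Both calls run the same function & return the same result
print(mean_full, mean_alias)
```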

Many common packages are included when you install Python (the standard library); others, such as pandas & numpy, must be installed separately. This can be done from within a notebook or from the command prompt. Packages only need to be installed once per environment, so it is best practice to keep installation commands out of your scripts.

In [1]:
### Import pandas & numpy packages

# Import pandas
import pandas as pd

# Import Numpy
import numpy as np

In [None]:
### Install a pip package in the current Jupyter kernel

# Import sys package
import sys

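# sys.executable is the path to the Python running this kernel.
# Calling pip through it installs the desired package, numpy
# in this case, into the same environment the notebook uses.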
!{sys.executable} -m pip install numpy
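
Because a package only needs to be installed once, it can help to check whether Python can already find it before installing. The sketch below uses `importlib.util.find_spec` from the standard library; the helper name `is_installed` & the second package name are made up for illustration.

```python
import importlib.util

def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None

# 'json' ships with Python, so it should be found
print(is_installed("json"))

# A made-up name that should not be found
print(is_installed("a_package_that_does_not_exist"))
```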

### Loading Data

Several packages can be used to import data - this tutorial will focus on pandas. Pandas is one of the most popular packages used in Python & therefore has a lot of documentation on how to use its functions. The following examples are based on loading data from CSV & Excel files.

<br>

**Load csv files**
<br>
read_csv(*filepath*)
<br>
<br>
**Load excel files**
<br>
read_excel(*filepath*)



In [6]:
### Load a csv file
# Loading an excel file follows the same process & will not be covered here

# Define the file path
filepath_csv = 'C:/Users/JoeRatterman/Documents/GitHub/MarchMadness2021/boxscores/2021_boxscores.csv'

# Load file
df = pd.read_csv(filepath_csv)

# Display the first 3 rows of data
df.head(3)

Unnamed: 0,away_assist_percentage,away_assists,away_block_percentage,away_blocks,away_defensive_rating,away_defensive_rebound_percentage,away_defensive_rebounds,away_effective_field_goal_percentage,away_field_goal_attempts,away_field_goal_percentage,...,home_two_point_field_goals,home_win_percentage,home_wins,location,losing_abbr,losing_name,pace,winner,winning_abbr,winning_name
0,41.4,12,2.4,1,96.0,64.5,20,0.594,53,0.547,...,22,0.0,0,"Germain Arena, Estero, Florida",AUSTIN-PEAY,Austin Peay,75.2,Away,ABILENE-CHRISTIAN,Abilene Christian
1,57.7,15,0.0,0,78.4,81.5,22,0.5,58,0.448,...,21,0.0,0,"Germain Arena, Estero, Florida",NEBRASKA-OMAHA,Omaha,73.8,Away,ABILENE-CHRISTIAN,Abilene Christian
2,52.9,9,8.1,3,98.8,56.4,22,0.4,50,0.34,...,21,0.0,0,"Moody Coliseum , Abilene, Texas",Howard Payne\n\t\t\t,Howard Payne\n\t\t\t,81.6,Home,ABILENE-CHRISTIAN,Abilene Christian
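
The `Unnamed: 0` column in the output above typically appears when a CSV was saved together with its row index, which leaves the first header cell blank. Assuming that is what happened here, the sketch below shows how the `index_col` parameter of `read_csv` handles it. The small in-memory CSV is made up for illustration & is not the box score file.

```python
import io
import pandas as pd

# Hypothetical CSV text whose first header cell is blank,
# as happens when a DataFrame is saved along with its index
csv_text = ",team,points\n0,Team A,80\n1,Team B,65\n"

# Default load: the blank header becomes a column named "Unnamed: 0"
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default.columns.tolist())

# Treat the first column as the row index instead
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_indexed.columns.tolist())
```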


Data can also be read from databases. The process is similar; however, it can require different packages, & the appropriate package depends on the goal of the data load. Some packages connect to a database & read a table or execute a stored procedure, & others let you write SQL directly in your Python environment. Using SQL in Python will not be covered in depth in this tutorial, but if you're interested, search Google for sqlite3, a package that is included with Python.
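
For readers curious about sqlite3, here is a minimal sketch of the general pattern: connect, run SQL, read the rows back. The table name `sales`, its columns & its values are invented for illustration, & the database lives only in memory, so nothing is written to disk.

```python
import sqlite3

# ":memory:" creates a temporary database that disappears when closed
conn = sqlite3.connect(":memory:")

# Create a small made-up table & add a few rows
conn.execute("CREATE TABLE sales (brand TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales (brand, units) VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 5)],
)

# Total the units for each brand
rows = conn.execute(
    "SELECT brand, SUM(units) FROM sales GROUP BY brand ORDER BY brand"
).fetchall()
print(rows)

conn.close()
```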




In [7]:

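### Load data from a Microsoft Access database
# Requires the pyodbc package & the Microsoft Access ODBC driver

import pyodbc
import pandas as pd

# Connect to the Access database file
cnxn = pyodbc.connect(r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
                      r'DBQ=C:\users\bartogre\desktop\data.mdb;')

# Sum CYTM & PYTM for each brand
sql = "Select sum(CYTM), sum(PYTM), BRAND From data Group By BRAND"

# Run the query & load the result into a DataFrame
data = pd.read_sql(sql, cnxn)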


SystemError: <class 'pyodbc.Error'> returned a result with an error set