# Loading Packages in Python

### What is a package
A **package** is a collection of functions that a person or organization has created & published. The Python Package Index (PyPI) is the repository most commonly used to publish packages for anyone to use. Packages can also be published to & downloaded from GitHub or other Git repositories. <br>

To import an entire package use the below syntax: <br>
- import math <br>

To import a specific function from a package use the below syntax: <br>
- from math import sqrt <br>

If only one or two functions will be used from a package (some packages have hundreds of functions), it is best practice to use the from *package* import *function* syntax. This makes the code easier to read, as the reader will know exactly which function comes from each package. Be aware that different packages may have functions with the same name: if two such functions are imported by name, the later import replaces the earlier one, so in that case import the whole package & call the function with the package name in front. Typically this won't be an issue, so use your best judgement. <br>
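
The two import styles, and the name clash mentioned above, can be sketched with the standard library alone. This is a minimal illustration rather than a rule: cmath (complex-number math) is chosen here only because it also defines a function called sqrt.

```python
import math                 # import the whole package
from math import sqrt       # import one function by name

# With the whole package imported, the package name goes in front of the call
print(math.sqrt(16))        # 4.0

# With a single function imported, no package name is needed
print(sqrt(16))             # 4.0

# cmath also defines sqrt; importing it by name replaces the sqrt
# that was imported from math above
from cmath import sqrt
print(sqrt(16))             # (4+0j)

# The prefixed form still refers to math's version
print(math.sqrt(16))        # 4.0
```

After the second import, the bare name sqrt refers to the cmath version, which is why the prefixed form is safer when two packages share a function name.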


### What is a function 
A **function** is a collection of steps designed to complete a specific task. For example, if I frequently have to calculate the square root of a number and add 5 & am tired of manually typing the steps every time, I can write a function to simplify the process. More details on functions will be given in a later section. 


### Final note - Aliases
When importing a package, it is common practice to give the package an alias: <br>

import pandas as pd <br>

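Python requires the syntax **package_name**.*function* to use a function from an imported package. When an alias is given, the alias takes the place of the package name (for example, pd.read_csv), which helps keep the code clean & easy to read. Longer package names are usually given an alias, while packages with short names typically are not: <br>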
- import numpy as np<br>
- import math<br>
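
As a small sketch of how an alias is used once it is defined, the example below aliases the standard-library statistics package. The alias stats is an assumption chosen for illustration (it is not a fixed convention like pd or np), and the numbers are the three pace values shown in the sample data later in this section.

```python
# Standard-library package used so the example runs without installing anything;
# the alias "stats" is a hypothetical choice for this sketch
import statistics as stats

pace_values = [71.0, 64.3, 72.6]

# The alias replaces the package name in front of the function
average_pace = stats.mean(pace_values)
print(round(average_pace, 1))
```

The same pattern applies to pandas: after import pandas as pd, every pandas function is called as pd.*function*.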

When using the from *package* import *function* format, Python does not require **package_name**.*function* to run a function. See the example below: 

In [3]:
# Import sqrt function from the math package
from math import sqrt

# Define a function that returns the square root of a number plus 5
def sq_root(num): 

    return sqrt(num) + 5

# Calculate the square root + 5
print(sq_root(100))

15.0


Many common packages are already installed when you download Python; however, some packages do need to be installed. Packages should always be installed from the command line & not in a script: once a package is installed on your machine, it stays installed, so installing it in a script would be redundant. <br>

To install a package, open the command prompt and use the below syntax (py is the Python launcher on Windows; on macOS or Linux, use python3 in its place):<br>
- py -m pip install *package*

### Loading Data

There are several methods & packages that can be used to import data. This tutorial will focus on pandas, which is one of the most popular packages used in Python & therefore has a lot of documentation on how to use its functions. The following examples are based on loading data from csv & excel files. <br>

**Load csv files**<br>
pd.read_csv(*filepath*)<br><br>
**Load excel files**<br>
pd.read_excel(*filepath*)
<br>
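
To try read_csv without a file on disk, the sketch below passes it an in-memory text buffer instead of a file path. It assumes pandas is already installed; the two rows are a cut-down, illustrative copy of a few columns from the boxscore data used later in this section, not a separate dataset.

```python
import io

import pandas as pd

# Small sample standing in for a csv file on disk (illustrative subset of columns)
csv_text = """winner,pace,winning_name
Home,71.0,Utah
Home,64.3,Texas A&M
"""

# read_csv accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (2, 3)
print(list(df.columns))   # ['winner', 'pace', 'winning_name']
```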

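**Note** that when typing a file path in Python, you must use a forward slash "/" between folders or two backslashes (`\\`). A single backslash can be read as the start of an escape sequence (such as `\n` for a new line) rather than as a folder separator. Both options work but typically the "/" is used. 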
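
The note above can be checked with a short sketch. The folder and file names are hypothetical, and PureWindowsPath is used only because it compares Windows-style paths on any operating system without touching the disk.

```python
from pathlib import PureWindowsPath

# Three ways to write the same Windows-style path (hypothetical folders)
forward = 'C:/projects/data/scores.csv'
escaped = 'C:\\projects\\data\\scores.csv'   # each \\ is stored as one backslash
raw = r'C:\projects\data\scores.csv'         # the r prefix turns escape sequences off

# The escaped and raw strings are the same text
print(escaped == raw)   # True

# PureWindowsPath treats / and \ as the same separator
print(PureWindowsPath(forward) == PureWindowsPath(escaped))   # True
```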

The example below loads data from a csv file:

In [19]:
### Load a csv file - loading an excel file follows the same process & will not be covered here

# Import pandas 
import pandas as pd

# Define the file path
file_path = '../../2022-fall-python-tutorial/data/2022_boxscores.csv'

# Load file
df = pd.read_csv(file_path)

# Show the first few rows of the data
df.head(3)

Unnamed: 0,away_assist_percentage,away_assists,away_block_percentage,away_blocks,away_defensive_rating,away_defensive_rebound_percentage,away_defensive_rebounds,away_effective_field_goal_percentage,away_field_goal_attempts,away_field_goal_percentage,...,home_two_point_field_goals,home_win_percentage,home_wins,location,losing_abbr,losing_name,pace,winner,winning_abbr,winning_name
0,50.0,11,15.4,6,98.6,51.3,20,0.421,57,0.386,...,17,0.0,0,"Jon M. Huntsman Center, Salt Lake City, Utah",ABILENE-CHRISTIAN,Abilene Christian,71.0,Home,UTAH,Utah
1,68.8,22,0.0,0,101.3,67.7,21,0.507,73,0.438,...,15,0.0,0,"Reed Arena, College Station, Texas",ABILENE-CHRISTIAN,Abilene Christian,64.3,Home,TEXAS-AM,Texas A&M
2,59.1,13,0.0,0,86.6,76.7,23,0.381,67,0.328,...,18,0.0,0,"College Park Center, Arlington, Texas",TEXAS-ARLINGTON,UT Arlington,72.6,Away,ABILENE-CHRISTIAN,Abilene Christian


Data can also be loaded from text files, databases, APIs, and many other sources. Those are not covered in this tutorial, but there is a lot of documentation online on the topic. If needed, search for *python import data from database/text file/API/etc.*

### Installing Packages

A number of packages come pre-installed in Python; however, many will need to be manually installed. This can be accomplished using the terminal in VS Code. <br>

First, select the 'New Terminal' option in the 'Terminal' section of the top ribbon in VS Code. <br>

![](support-docs/new_terminal.JPG)

This will open up the terminal at the bottom of the screen: 

![](support-docs/current_terminal.JPG)

To install a package, use 'py -m pip install *package_name*':

![](support-docs/pandas_install.JPG)
<br><br>

After the install finishes, you will see either a success message or a detailed error message that helps explain the next steps. 
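
One way to confirm from Python that a package can be found after installing it is sketched below. The helper name is_installed is hypothetical (it is not part of pip or Python), and the check uses the import name of the package, which for some packages differs from the name used with pip.

```python
from importlib import util


def is_installed(package_name):
    """Return True if Python can find a package with this import name."""
    return util.find_spec(package_name) is not None


# math ships with Python, so it is always found
print(is_installed('math'))   # True

# Hypothetical name used only to show the "not installed" case
print(is_installed('package_that_is_not_installed_xyz'))   # False
```

Calling is_installed('pandas') after the install step above should print True if the install succeeded.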



#### Next Steps

Head back to the [Course Intro](./00_course_intro.ipynb) or to the next section: [Data Manipulations with Python](./04_data_manipulations_in_python.ipynb)