## Installing External Packages (Using pip)

For example:
- numpy
- pandas
- matplotlib
- seaborn
- folium
- geopy
- plotly

### Reference

- https://realpython.com/what-is-pip/

## pip

- pip is an acronym for "pip installs packages"
- pip is a package manager for Python
- It was created in 2008
- Package management is so important that Python’s installers have included pip since versions 3.4 and 2.7.9, for Python 3 and Python 2, respectively
- You should run pip as a module ("python -m pip"), so that packages are installed for the Python interpreter you are actually using
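
The last bullet can be illustrated with a minimal sketch, assuming it runs in the same interpreter as the notebook kernel. `sys.executable` is the path of the running interpreter, so invoking pip through it targets that interpreter's environment. The variable names are illustrative.

```python
import subprocess
import sys

# sys.executable is the absolute path of the interpreter running this code
pip_command = [sys.executable, "-m", "pip", "--version"]

# Run pip as a module of that exact interpreter; if pip is missing,
# no exception is raised and the return code is simply non-zero
result = subprocess.run(pip_command, capture_output=True, text=True)

print("Interpreter:", sys.executable)
print("Return code:", result.returncode)
print(result.stdout.strip())
```

The reported pip location should sit inside the same environment as the interpreter path printed first.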

## Installing packages

- Python has a comprehensive standard library and has an active community that contributes an even more extensive set of packages
- These "other" packages are published to the Python Package Index, also known as PyPI (https://pypi.org/)
- By default, pip installs the latest version of the package
- Multiple packages can be installed at once: "python -m pip install \<pkg1> \<pkg2>"
- By default, pip looks for packages on PyPI, but you can point it at another index, for example:
  - "python -m pip install -i https://test.pypi.org/simple/ \<pkg>"
- pip can also install packages from git repos
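
The command variants listed above can be sketched as a small helper that builds, but does not run, the corresponding command lists. The function name `pip_install_command` is hypothetical, and the Git URL is a placeholder rather than a real repository.

```python
import sys

def pip_install_command(*requirements, index_url=None):
    """Build (but do not run) a `python -m pip install` command list."""
    command = [sys.executable, "-m", "pip", "install"]
    if index_url is not None:
        # -i is the short form of --index-url
        command += ["-i", index_url]
    command += list(requirements)
    return command

# Several packages in one call
print(pip_install_command("numpy", "pandas"))

# A different package index (TestPyPI, as in the bullet above)
print(pip_install_command("<pkg>", index_url="https://test.pypi.org/simple/"))

# A Git repository: pip accepts a "git+" prefix on the URL (placeholder URL)
print(pip_install_command("git+https://example.com/<user>/<repo>.git"))
```

Each list could be passed to `subprocess.run` to perform the actual installation.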

In [None]:
# numpy is short for Numerical Python
!python -m pip install numpy

## After installing numpy (which, by default, installs the latest version)

![numpy Install](gfx/install-numpy.png)

A new package shows up in the site-packages directory.
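
To find that directory without browsing the file system, one option (a sketch, not the only way) is the standard library's `sysconfig` module. The variable name `site_packages` is illustrative.

```python
import sysconfig

# "purelib" is the directory where pure-Python packages are installed,
# which is normally the site-packages directory of the active environment
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```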

In [None]:
!python -m pip list

In [None]:
# show information about numpy
!python -m pip show numpy

In [None]:
# import and display version
import numpy as np
np.__version__

In [None]:
# import and display package location
import numpy as np
np.__file__
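
The version can also be read without importing the package, using the standard library's `importlib.metadata` (Python 3.8+). This is a minimal sketch; the helper `installed_version` and the missing package name are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# The last name is a made-up package that should not be installed
for name in ["numpy", "pandas", "hypothetical-missing-package-xyz"]:
    print(name, "->", installed_version(name))
```

This reads the same installation metadata that `pip show` reports, so it works even for packages that do not define `__version__`.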

In [None]:
# numpy example
import numpy as np

# Create a 1D array
arr_1d = np.array([1, 2, 3, 4, 5])
print(f"1D Array: {arr_1d}")

# Create a 2D array (matrix)
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D Array:\n{arr_2d}")

# Check the type and shape of an array
print(f"Type of arr_1d: {type(arr_1d)}")
print(f"Shape of arr_2d: {arr_2d.shape}")

In [None]:
# data analysis and manipulation tool
!python -m pip install pandas

In [None]:
# import and display version
import pandas as pd
pd.__version__

In [None]:
import numpy as np
import pandas as pd

# np.nan marks a missing value in the Series
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)

In [None]:
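import numpy as np
import pandas as pd

# six consecutive dates used as the row index
dates = pd.date_range("20130101", periods=6)
# 6x4 table of random values with columns A-D
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))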
df

In [None]:
# install a specific version
!python -m pip install matplotlib==3.8.4

In [None]:
import matplotlib
matplotlib.__version__

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.plot(x, y)
plt.show()

In [None]:
!python -m pip install seaborn

In [None]:
# import seaborn 
import seaborn as sns 

# loading dataset 
data = sns.load_dataset("iris") 

# draw lineplot 
sns.lineplot(x="sepal_length", y="sepal_width", data=data) 

In [None]:
# what datasets are available?
import seaborn as sns
sns.get_dataset_names()

In [None]:
# seaborn with matplotlib
import seaborn as sns 
import matplotlib.pyplot as plt 

# loading dataset 
data = sns.load_dataset("iris") 

# draw lineplot 
sns.lineplot(x="sepal_length", y="sepal_width", data=data) 

# setting the title using Matplotlib
plt.title('Title using Matplotlib Function')

plt.show()

In [None]:
# interactive, browser-based graphing library
!python -m pip install plotly

In [None]:
import plotly.express as px
fig = px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16])
fig.show()

In [None]:
# install everything listed in file
# Requirements file format: https://pip.pypa.io/en/stable/reference/requirements-file-format/
!python -m pip install -r requirements.txt

In [None]:
# generate the requirements file
!python -m pip freeze > new_requirements.txt
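
`pip freeze` writes one `name==version` line per installed package. The sketch below reads such lines back into a dictionary. It handles only simple `name==version` pins and treats everything after `#` as a comment, which is a simplification of the full requirements file format. The function name and the sample contents are hypothetical.

```python
def parse_pinned_requirements(text):
    """Map package names to versions for lines of the form name==version."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name, _, version = line.partition("==")
        pinned[name.strip()] = version.strip()
    return pinned

# Hypothetical file contents, for illustration only
sample = """\
# hypothetical requirements file
numpy==1.26.4
matplotlib==3.8.4
"""
print(parse_pinned_requirements(sample))
```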

## numpy

### Reference

- https://numpy.org/

In [None]:
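# the financial functions (npv, irr, ...) were removed from NumPy itself,
# so the old import no longer works: from numpy import npv, irr
# they now live in the separate numpy-financial package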
!python -m pip install numpy-financial

In [None]:
import numpy_financial as npf
npf.irr([-250000, 100000, 150000, 200000, 250000, 300000])

In [None]:
# import numpy and numpy_financial
import numpy as np
import numpy_financial as npf

# create numpy array "x"
x = np.array([-250000, 100000, 150000, 200000, 250000, 300000])
print(x)
print(x.dtype)
# create numpy array "y"
y = np.array([1.1, 2.2, 3.3, 4.4])
print(y)
print(y.dtype)
print(npf.irr(x))
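
The internal rate of return is the discount rate at which the net present value (NPV) of the cash flows is zero. The sketch below finds it by bisection in plain Python to show the idea. It is not meant to mirror how numpy-financial computes the result; the function names and the search interval of 0% to 100% are assumptions.

```python
def npv(rate, cash_flows):
    """Net present value, with the first cash flow at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr_bisection(cash_flows, low=0.0, high=1.0, tolerance=1e-9):
    """Find a rate in [low, high] where the NPV crosses zero."""
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the given interval")
    while high - low > tolerance:
        mid = (low + high) / 2
        # keep the half-interval on which the NPV still changes sign
        if npv(low, cash_flows) * npv(mid, cash_flows) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2

cash_flows = [-250000, 100000, 150000, 200000, 250000, 300000]
rate = irr_bisection(cash_flows)
print(f"IRR is approximately {rate:.4%}")
print(f"NPV at that rate: {npv(rate, cash_flows):.6f}")
```

The printed rate can be compared with the value returned by `npf.irr` for the same cash flows.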

In [None]:
# create a pandas DataFrame from numpy array
import numpy as np
import pandas as pd
# create numpy array "x"
x = np.array([-250000, 100000, 150000, 200000, 250000, 300000])
df = pd.DataFrame(x)
print(df)

In [None]:
import matplotlib.pyplot as plt

# plot the DataFrame created in the previous cell
plt.plot(df)
plt.show()