# Recommended Imports
```python
#Graph and plot stuff
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt 

#Math stuff
import numpy as np

#Stonks
from yfinance import Ticker

#Scrape & Processing
from bs4 import BeautifulSoup
import requests

#Multithreading
import concurrent.futures as concur
```

# Reading Data

### Reading Stata Files
```python
df = pd.read_stata('stata_file.dta')
```
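If you have no `.dta` file around to try this on, here is a minimal round-trip sketch: it writes a tiny made-up DataFrame to a temporary Stata file with `DataFrame.to_stata` and reads it back. The column names and values are invented for illustration.

```python
import os
import tempfile

import pandas as pd

# Made-up numeric data, only there to have something to write
sample = pd.DataFrame({"hours": [40, 35, 20], "wage": [10.5, 12.0, 9.75]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.dta")
    # write_index=False keeps the pandas index out of the Stata file
    sample.to_stata(path, write_index=False)
    df_back = pd.read_stata(path)

print(df_back)
```

Integer columns may come back with a smaller integer dtype than they went in with, since Stata has its own storage types.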
### Reading Excel Files
Warning: pandas needs the `openpyxl` package to read `.xlsx` files, so depending on your environment you may have to run
```bash
pip install openpyxl
```
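After installation, you can run the line below. Here `na_values='..'` treats cells containing `..` as missing, and `skipfooter=5` drops the last 5 rows of the sheet (handy for footnotes).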
```python
df = pd.read_excel('excel_file.xlsx', na_values='..', skipfooter=5)
```
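To see what `na_values` and `skipfooter` do without needing an Excel file, the sketch below passes the same two parameters to `pd.read_csv` on made-up in-memory text. Note that `read_csv` only supports `skipfooter` with `engine='python'`; the data and column names are invented.

```python
import io

import pandas as pd

# Made-up export: '..' marks a missing value, the last two lines are footnotes
lines = [
    "country,gdp",
    "A,1.5",
    "B,..",
    "C,3.0",
    "Source: made-up numbers",
    "Last updated: never",
]
raw = "\n".join(lines)

df_demo = pd.read_csv(
    io.StringIO(raw),
    na_values="..",    # read '..' as NaN
    skipfooter=2,      # ignore the two footnote lines
    engine="python",   # needed for skipfooter in read_csv
)

print(df_demo)
print(df_demo["gdp"].isna().sum())  # number of missing gdp values
```

Without `skipfooter`, the footnote lines would end up as junk rows; without `na_values`, the `gdp` column would be read as text instead of numbers.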
### Reading CSV Files
I totally did not take this from my stonk freetime project ;)

`pd.read_csv` accepts either a URL or a local file path:
```python
url = "https://datahub.io/core/s-and-p-500-companies/r/constituents.csv"  # Replace with your dataset URL or file path
df = pd.read_csv(url)
```

# Seaborn
<a href="https://github.com/jramshur/Coding-Cheat-Sheets/blob/master/Python%20for%20Data%20Science%20-%20Cheat%20Sheet%20-%20Seaborn.pdf">Cheat sheet</a>

### Facet Grids
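Facet grids are great for data exploration, but they should not be used in formal presentations since they can easily send the wrong message.
The snippet below draws one scatter plot per combination of the column and row variables, with the points on each plot coloured by the categories in `CAT`.

```python
# COL_VAR and ROW_VAR are placeholder categorical columns that define the grid;
# X and Y are the numeric columns plotted inside each panel.
g = sns.FacetGrid(to_plot, hue='CAT', col='COL_VAR', row='ROW_VAR')
g.map(plt.scatter, 'X', 'Y')
g.add_legend()
plt.show()
```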

### Facet Grids with Regression Lines
You can also add regression lines to your facet grid plots with:

```python
g_countries = sns.FacetGrid(df_with_country, row='FUEL', col='COUNTRY')
g_countries.map(sns.regplot, 'MPG', 'WEIGHT')
plt.show()
```

### Regression and Scatter Plot with Despine
`sns.despine` removes the top and right axis lines (spines) from a plot.
```python
fig, ax = plt.subplots(figsize=(10, 5))

sns.regplot(x='xvals',
            y='yvals',
            data=df,
            ax=ax,
            color='black',
            # ci=None turns off the confidence interval band
            ci=None)

# Removes the ugly extra lines (top and right spines)
sns.despine(ax=ax)

# Labels
ax.set_title('title')
ax.set_ylabel('ylabel')
ax.set_xlabel('xlabel')

plt.show()
```
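A quick way to check what `sns.despine` actually changed is to ask the axes which spines are still visible. This is a small sketch on a throwaway plot; the plotted numbers are made up.

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # throwaway line, just so the axes has content

# With default arguments, despine hides the top and right spines
sns.despine(ax=ax)

visible = {
    side: ax.spines[side].get_visible()
    for side in ("top", "right", "left", "bottom")
}
print(visible)
```

The top and right spines should report as hidden while left and bottom stay visible. To hide more, pass e.g. `left=True` or `bottom=True` to `sns.despine`.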

# Matplotlib
<img src="matplotlibcheatsheet.webp" />

### Histograms
```python
# df_col is a single column (a pandas Series), e.g. one column selected from df
df_col.plot(kind='hist', bins=100, alpha=0.7, color='blue', edgecolor='black')
plt.title('Histogram of Scores')
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.grid(axis='y', alpha=0.75)
plt.show()
```
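A runnable version of the histogram on made-up data, in case you want to play with `bins` before pointing it at a real column. The `scores` Series is randomly generated and purely illustrative; this variant uses the `Axes` object returned by `.plot` instead of the `plt.` functions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# 500 made-up scores centred around 70
scores = pd.Series(rng.normal(loc=70, scale=10, size=500), name="score")

fig, ax = plt.subplots()
scores.plot(kind="hist", bins=20, alpha=0.7, color="blue",
            edgecolor="black", ax=ax)
ax.set_title("Histogram of Scores")
ax.set_xlabel("Values")
ax.set_ylabel("Frequency")
ax.grid(axis="y", alpha=0.75)

print(len(ax.patches))  # one bar per bin
```

Call `plt.show()` afterwards to display it. Fewer bins give a smoother shape, more bins show finer detail but get noisy on small samples.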