# ADAPT Pro - Topic 2 - Financial Analysis with Python

**Before We Get Started**

- Materials are also on GitHub: https://github.com/TheMarqueeGroup/ADAPTPro-Topic2/
- Run the getting-started notebook:
    - Launch Jupyter: `go/jupyter`
    - Open `jupyter/notebooks/lob/core/ADAPT/GettingStarted.ipynb`
- Demo code is located in: `jupyter/notebooks/lob/core/ADAPT_Pro/FinAnalysis/`


**Agenda Today**
- K-Means Demo - unsupervised learning
- Decision Trees - supervised learning
- Portfolio Optimization

## K-Means Demo
- demo located at: `~bogdan.tudose/python;rps:/jupyter/notebooks/lob/core/ADAPT_Pro/FinAnalysis/KMeanDemo.ipynb`

### Import packages

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px #viz package for interactive charts

from sklearn.cluster import KMeans #ML Algo

### Dummy Data - Companies with different Debt, Margin, Sales Growth

In [2]:
cos = ['A','B','C','D','E','F','G','H','I', 'J']
d_ebitda = [4.5, 5, 3.75, 4, 1.0, 0.5, 0, 6, 7, 6.5]  # Debt / EBITDA (leverage multiple)
sales_growth = [15, 10, 12, 11, 40, 60, 35, -1, -2, 0]  # year-over-year sales growth, %
margins = [60, 80, 60, 40, 15, 10, 0, 15, 20, 5]  # EBITDA margin, %

df = pd.DataFrame({'Company':cos,'D/EBITDA':d_ebitda,'Sales YoY':sales_growth,'EBITDA Margin':margins})
df

In [3]:
#Quick Visualizations
df.plot(x='D/EBITDA',y='Sales YoY', kind='scatter')
df.plot(x='D/EBITDA',y='EBITDA Margin', kind='scatter')
df.plot(x='Sales YoY',y='EBITDA Margin', kind='scatter')

In [4]:
sns.pairplot(df) #pairwise scatter plots for all numeric columns, with histograms on the diagonal

In [6]:
X = df[['EBITDA Margin','D/EBITDA','Sales YoY']]
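
The three features sit on different scales (D/EBITDA runs from 0 to 7, EBITDA Margin from 0 to 80), and K-Means relies on Euclidean distance, so the wider-ranging columns carry more weight in the clustering. Below is a minimal sketch of standardizing the features first. This step is not part of the original demo; `scaler` and `X_scaled` are names introduced here for illustration.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Same dummy data as in the demo
df = pd.DataFrame({
    'Company': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'D/EBITDA': [4.5, 5, 3.75, 4, 1.0, 0.5, 0, 6, 7, 6.5],
    'Sales YoY': [15, 10, 12, 11, 40, 60, 35, -1, -2, 0],
    'EBITDA Margin': [60, 80, 60, 40, 15, 10, 0, 15, 20, 5],
})
X = df[['EBITDA Margin', 'D/EBITDA', 'Sales YoY']]

# Rescale each column to mean 0 and (population) standard deviation 1
scaler = StandardScaler()
X_scaled = pd.DataFrame(scaler.fit_transform(X), columns=X.columns, index=X.index)

print(X_scaled.round(2))
```

Passing `X_scaled` instead of `X` to `KMeans` gives each feature equal weight; whether that is desirable depends on how much each ratio should matter to the grouping.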

In [7]:
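sse = [] #sum of squared errors for each k
for k in range(1,11): #number of clusters
    #n_init and random_state are set explicitly so the curve is reproducible
    kmeans = KMeans(n_clusters = k, n_init = 10, random_state = 42)
    kmeans.fit(X)
    
    sse.append(kmeans.inertia_) #inertia_ = sum of squared distances to the closest centroid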

#Quick plot of Elbow Curve
plt.plot(range(1,len(sse)+1), sse)
plt.title("Elbow Curve")
plt.xlabel('Number of Clusters')
plt.ylabel('SSE')
plt.show()
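
To make the SSE on the y-axis concrete, here is a small sketch that recomputes it by hand for one value of k and compares it with `inertia_`. The names `own_centroid` and `manual_sse` are introduced here for illustration, and `random_state=42` is an arbitrary choice to make the run repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Same three features as in the demo
X = pd.DataFrame({
    'EBITDA Margin': [60, 80, 60, 40, 15, 10, 0, 15, 20, 5],
    'D/EBITDA': [4.5, 5, 3.75, 4, 1.0, 0.5, 0, 6, 7, 6.5],
    'Sales YoY': [15, 10, 12, 11, 40, 60, 35, -1, -2, 0],
})

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Look up each company's own centroid through its cluster label
own_centroid = kmeans.cluster_centers_[kmeans.labels_]

# SSE: squared Euclidean distance from each company to its centroid, summed
manual_sse = ((X.to_numpy() - own_centroid) ** 2).sum()

# The two values should agree up to floating-point rounding
print(manual_sse, kmeans.inertia_)
```

Because every extra cluster can only pull centroids closer to the points, SSE keeps falling as k grows; the elbow is where the drop flattens out.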

### Run the K-Means Algo with Optimal k=3

In [14]:
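#random_state fixes the initialization so the cluster labels are the same on every run
kmeans = KMeans(n_clusters = 3, n_init = 10, random_state = 42)
kmeans.fit(X)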

In [15]:
kmeans.labels_

In [17]:
#attach each company's cluster label to the original table
df['Category'] = kmeans.labels_
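
Once the model is fitted, it can also assign a company it has not seen to the nearest existing cluster. The sketch below uses a made-up company (`new_co`, with hypothetical values that are not part of the demo data) and checks the result against a manual nearest-centroid lookup.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Same three features as in the demo
X = pd.DataFrame({
    'EBITDA Margin': [60, 80, 60, 40, 15, 10, 0, 15, 20, 5],
    'D/EBITDA': [4.5, 5, 3.75, 4, 1.0, 0.5, 0, 6, 7, 6.5],
    'Sales YoY': [15, 10, 12, 11, 40, 60, 35, -1, -2, 0],
})
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Hypothetical new company; columns must be in the same order as X
new_co = pd.DataFrame({'EBITDA Margin': [12], 'D/EBITDA': [0.8], 'Sales YoY': [45]})

# predict() returns the index of the closest centroid
new_label = kmeans.predict(new_co)[0]

# Manual check: squared distance from the new company to every centroid
sq_dist = ((kmeans.cluster_centers_ - new_co.to_numpy()) ** 2).sum(axis=1)
print(new_label, sq_dist.argmin())
```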

In [21]:
centroids = kmeans.cluster_centers_

In [22]:
centroids_df = pd.DataFrame(centroids)
centroids_df.columns = X.columns #label the centroid coordinates with the feature names
centroids_df

In [24]:
df
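
One way to read the labelled table is to average the features within each cluster. The sketch below does that with `groupby` and compares the result with `cluster_centers_`; on this small dataset the two should match, since each centroid is the mean of its members. `cluster_means` and `cluster_sizes` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Same dummy data as in the demo
df = pd.DataFrame({
    'Company': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'D/EBITDA': [4.5, 5, 3.75, 4, 1.0, 0.5, 0, 6, 7, 6.5],
    'Sales YoY': [15, 10, 12, 11, 40, 60, 35, -1, -2, 0],
    'EBITDA Margin': [60, 80, 60, 40, 15, 10, 0, 15, 20, 5],
})
X = df[['EBITDA Margin', 'D/EBITDA', 'Sales YoY']]

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
df['Category'] = kmeans.labels_

# Average profile and number of companies per cluster (rows sorted by label 0, 1, 2)
cluster_means = df.groupby('Category')[list(X.columns)].mean()
cluster_sizes = df['Category'].value_counts().sort_index()

print(cluster_means.round(2))
print(cluster_sizes)
```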

In [26]:
x = 'EBITDA Margin'
y = 'Sales YoY'
#centroids as black x markers, selected by column name so they match the chosen axes
plt.scatter(centroids_df[x], centroids_df[y], c = 'black', marker='x')
#companies colored by cluster; colormaps: https://matplotlib.org/stable/tutorials/colors/colormaps.html
plt.scatter(df[x], df[y], c = df['Category'], cmap ="rainbow")
plt.xlabel(x)
plt.ylabel(y)
plt.show()

## Decision Tree Demo
- path to notebook: `~bogdan.tudose/python;rps:/jupyter/notebooks/lob/core/ADAPT_Pro/FinAnalysis/DecisionTreeDemo.ipynb`