# Fundamentals

## 1) Planning

Practice the following concepts covered in the module:

- Work with a Brazilian variable-income database (stocks);

- Work with indicators of fundamental analysis of stocks (Price, Liquidity, P/E, DY);

- Work with two tools used in data science (Excel and Python);

- Perform exploratory analysis of variable income data;

- Practice some basic Python commands for data analysis and plotting.
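
As a quick reminder of what two of the indicators above measure, here is a minimal sketch with hypothetical figures for a single made-up stock. None of the numbers come from the dataset, and the dataset may express DY on a different scale than the percentage used here.

```python
# Hypothetical figures for a single made-up stock (not taken from the dataset)
price = 20.0                # current share price, in R$
earnings_per_share = 2.5    # net income per share over the last 12 months
dividends_per_share = 1.2   # dividends paid per share over the last 12 months

# P/E (P/L in Portuguese): share price divided by earnings per share
price_to_earnings = price / earnings_per_share

# DY (dividend yield): dividends per share as a percentage of the share price
dividend_yield = dividends_per_share / price * 100

print(f"P/E: {price_to_earnings:.2f}")
print(f"DY: {dividend_yield:.2f}%")
```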

## 2) Collecting the data

In [None]:
#@title 2.1) Read the downloaded CSV file
import pandas as pd

stocks = pd.read_csv("../data/raw/stocks.csv", sep=";")

## 3) Exploratory analysis

In [None]:
#@title 3.1) First values
stocks.head()

In [None]:
#@title 3.2) Last values
stocks.tail()

In [None]:
#@title 3.3) Rows and columns count
stocks.shape

In [None]:
#@title 3.4) Column names
stocks.columns

In [None]:
#@title 3.5.1) Columns types
stocks.dtypes

In [None]:
#@title 3.5.2) Convert columns to appropriate types
# Brazilian number format: remove the "." thousands separator, then use "." as the decimal mark
stocks.replace(r"\.", "", regex=True, inplace=True)
stocks.replace(",", ".", regex=True, inplace=True)

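def to_numeric_if_possible(column):
    # Same effect as errors="ignore", which is deprecated since pandas 2.2
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        return column

stocks = stocks.apply(to_numeric_if_possible)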
stocks = stocks.convert_dtypes()

stocks.dtypes
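
An alternative worth knowing is to let `pd.read_csv` handle the Brazilian number format while reading, through its `decimal` and `thousands` parameters. The sketch below uses a small made-up sample in place of the real file. The rows and values are assumptions for illustration, and only the `;` separator and the column names `TICKER`, `PRECO` and `DY` come from the notebook.

```python
import io

import pandas as pd

# Made-up sample in the same style as the file:
# ";" as separator, "." for thousands and "," for decimals
sample_csv = io.StringIO(
    "TICKER;PRECO;DY\n"
    "AAAA3;1.234,56;5,10\n"
    "BBBB4;12,30;\n"
)

# Numeric columns are parsed at read time; the empty DY cell becomes NaN
sample = pd.read_csv(sample_csv, sep=";", decimal=",", thousands=".")

print(sample.dtypes)
print(sample)
```

Parsing at read time only touches values that look like numbers, so a text column that happens to contain a dot or a comma is left as it is.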

In [None]:
#@title 3.6) Statistical summary for numerical columns
stocks.describe()

In [None]:
#@title 3.7) Number of missing values in each column
stocks.isnull().sum()

In [None]:
#@title 3.8) Number of unique values
stocks.nunique()

In [None]:
#@title 3.9) How much memory each column uses in bytes
stocks.memory_usage()

In [None]:
#@title 3.10) Stocks with higher prices
stocks.nlargest(5, "PRECO")

## 4) Cleaning the data

In [None]:
#@title 4.1) Fill empty spaces with zero
stocks.fillna(0, inplace=True)

In [None]:
#@title 4.2) Drop rows with out-of-range prices (equal to 0 or above 1000)
stocks.drop(stocks[(stocks.PRECO == 0) | (stocks.PRECO > 1000)].index, inplace=True)
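
The same rule can also be written as a boolean mask that selects the rows to keep instead of dropping the rest. A minimal sketch on made-up data, where only the column names `TICKER` and `PRECO` come from the notebook:

```python
import pandas as pd

# Made-up prices for illustration
sample = pd.DataFrame(
    {
        "TICKER": ["AAAA3", "BBBB4", "CCCC3", "DDDD4"],
        "PRECO": [0.0, 25.5, 1500.0, 1000.0],
    }
)

# Keep rows whose price is not 0 and not above 1000
# (the logical complement of the drop condition)
valid_price = (sample["PRECO"] != 0) & (sample["PRECO"] <= 1000)
filtered = sample[valid_price].reset_index(drop=True)

print(filtered)
```

Building a new DataFrame this way leaves the original untouched, which makes it easier to re-run a cell without side effects.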


## 5) Analysing the data

In [None]:
#@title 5.1) Asset with the highest price
print("Highest price: ")
print(stocks.nlargest(1, "PRECO")[["TICKER", "PRECO"]])

In [None]:
#@title 5.2) 10 highest and 10 lowest prices
print("10 Highest prices: ")
print(stocks.nlargest(10, "PRECO")[["TICKER", "PRECO"]])

print("\n\n10 Lowest prices: ")
print(stocks.nsmallest(10, "PRECO")[["TICKER", "PRECO"]])

In [None]:
#@title 5.3) Sum and average of the Average Daily Liquidity
print(f"Liquidity sum: {round(stocks[' LIQUIDEZ MEDIA DIARIA'].sum() / 1000000, 2)}M")
print(f"Liquidity average: {round(stocks[' LIQUIDEZ MEDIA DIARIA'].mean() / 1000000, 2)}M")

In [None]:
#@title 5.4) Assets with P/E greater than 0
stocks[stocks["P/L"] > 0][["TICKER", "P/L"]]

In [None]:
#@title 5.5) Assets with DY above 0, sorted from highest to lowest
stocks[stocks["DY"] > 0][["TICKER", "DY"]].sort_values(by=["DY"], ascending=False)

In [None]:
#@title 5.6) List the preferred (PN) shares (tickers ending in 4)
stocks[stocks["TICKER"].str.endswith("4")]
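
One caveat: `str.endswith("4")` also matches tickers that end in 34, the suffix used by BDRs. If that matters for the analysis, a stricter pattern can require a four-character root followed by a single 4. The sketch below runs on a handful of tickers chosen for illustration, and the regular expression is an assumption about ticker structure, not a rule taken from the dataset.

```python
import pandas as pd

# A few tickers chosen for illustration (not read from the dataset)
sample = pd.DataFrame({"TICKER": ["PETR4", "VALE3", "AAPL34", "ITUB4", "TAEE11"]})

# Loose filter: anything ending in "4", which includes the BDR AAPL34
ends_with_4 = sample[sample["TICKER"].str.endswith("4")]

# Stricter filter: exactly four root characters followed by a single "4"
strict_pn = sample[sample["TICKER"].str.fullmatch(r"[A-Z0-9]{4}4")]

print(ends_with_4["TICKER"].tolist())
print(strict_pn["TICKER"].tolist())
```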

## 6) Sharing results

In [None]:
#@title 6.1) Plot 10 stocks with the highest Average Daily Liquidity
import matplotlib.pyplot as plt

highest_averages = stocks.nlargest(10, " LIQUIDEZ MEDIA DIARIA")

# Create a column chart
plt.bar(
    highest_averages["TICKER"],
    highest_averages[" LIQUIDEZ MEDIA DIARIA"] / 1000000
)

# Set chart title and labels
plt.title("10 Stocks with the highest Average Daily Liquidity")
plt.ylabel("Average daily liquidity (Millions R$)")
plt.xlabel("Stocks")

# Display the chart
plt.show()