# <center> Quantitative Finance - Class 14

### <center> Python Applications for Finance - By Gabriela Facciano

# Intro Housing Price Regression Walkthrough 
In this notebook, I walk through an analysis of the Housing Price Dataset. There is also an accompanying video on YouTube: https://youtu.be/NQQ3DRdXAXE

I touch on a few things in the **notebook**:
1. Basic data cleaning and feature exploration
2. Exploratory data analysis (answering questions we have about the data)
3. Basic data engineering (creating a pipeline for the train and test sets)
4. Model Experimentation and parameter tuning (Linear Regression, Random Forest, XGBoost, MLP)
5. Feature Engineering 
6. Ensembling 
7. Submitting to the Competition

Things I touch on in the **video**:
1. How to approach a problem like this 
2. How I would consider using AI tools like ChatGPT to solve a problem like this 
3. Why I made certain design decisions and the choices we have when we do open-ended projects like these
4. How you can continue and improve upon this analysis 

I have done something similar in the past with the **Titanic Dataset** if you want something slightly more beginner-friendly:

- Kaggle notebook: https://www.kaggle.com/code/kenjee/titanic-project-example/notebook
- YouTube Video: https://www.youtube.com/watch?v=I3FBJdiExcg&ab_channel=KenJee

My GitHub repos with additional free and paid resources: 
- ML Process: https://github.com/PlayingNumbers/ML_Process_Course
- ML Algorithms: https://github.com/PlayingNumbers/ML_Algorithms_Course

# Historical prices download


In this notebook we download the historical price series for a list of stocks.

1. BYMA prices, downloaded from Yahoo Finance.


To get the desired results, follow these steps:

1. Open the Excel file named 'tickers.xlsx', located in the same folder as this program: 
	* Complete the `'ticker_byma'` column.
	* Complete the `'ticker_yahoo'` column. 
2. Set the `'start_date'` variable in section 1 of this program.
3. Set the `'NOMBRE_OUTPUT'` variable in section 1 of this program. The data series will be saved under the name set in this variable.
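The setup described in the steps above could look roughly like the sketch below. It is only an illustration: the date, the output name, the ticker rows, and the `.csv` extension are assumed values, and the in-memory table stands in for reading 'tickers.xlsx' so that the example runs without the file.

```python
import pandas as pd

# Assumed values for the two variables named in the steps above.
start_date = "2020-01-01"        # hypothetical start of the historical series
NOMBRE_OUTPUT = "precios_byma"   # hypothetical base name for the saved data

# Stand-in for the contents of 'tickers.xlsx': same two columns, example rows.
tickers = pd.DataFrame({
    "ticker_byma": ["GGAL", "YPFD"],
    "ticker_yahoo": ["GGAL.BA", "YPFD.BA"],
})

# Drop incomplete rows so that every BYMA ticker has a Yahoo Finance symbol.
tickers = tickers.dropna(subset=["ticker_byma", "ticker_yahoo"])

# Map each Yahoo Finance symbol back to its BYMA ticker.
yahoo_to_byma = dict(zip(tickers["ticker_yahoo"], tickers["ticker_byma"]))

# The output name is derived from NOMBRE_OUTPUT (the extension is an assumption).
output_file = f"{NOMBRE_OUTPUT}.csv"
```

Keeping the symbol mapping in a plain dictionary makes it easy to rename downloaded columns back to their BYMA tickers later on.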


The ***Project*** is implemented in the following steps:

1. Kick-off: Library Imports, Variable Setup and Functions.

2. Data Loading

3. Data Cleaning

4. Data Transformation

5. Results saving
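As a rough sketch of how steps 2 to 5 fit together, the example below runs a tiny synthetic price table through load, clean, transform, and save functions. The function names, the sample prices, and the choice of simple daily returns as the transformation are assumptions made for illustration; the original notebook may implement each step differently.

```python
import io
import pandas as pd


def load_prices() -> pd.DataFrame:
    """Data loading stand-in: a small synthetic price table with flaws."""
    raw = io.StringIO(
        "date,GGAL,YPFD\n"
        "2024-01-03,102.0,51.0\n"
        "2024-01-02,100.0,\n"      # missing YPFD price
        "2024-01-02,100.0,\n"      # duplicated date
        "2024-01-04,104.04,52.02\n"
    )
    return pd.read_csv(raw, parse_dates=["date"], index_col="date")


def clean_prices(prices: pd.DataFrame) -> pd.DataFrame:
    """Sort by date, drop repeated dates, then drop rows with missing prices."""
    prices = prices.sort_index()
    prices = prices[~prices.index.duplicated(keep="first")]
    return prices.dropna()


def transform_prices(prices: pd.DataFrame) -> pd.DataFrame:
    """Assumed transformation: simple daily returns."""
    return prices.pct_change().dropna()


def save_results(frame: pd.DataFrame) -> str:
    """Return the CSV text; pass a path to to_csv to write a file instead."""
    return frame.to_csv()


returns = transform_prices(clean_prices(load_prices()))
csv_text = save_results(returns)
```

Splitting the steps into small functions keeps each stage testable on its own and mirrors the section order listed above.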

## Data Visualization - Practical Application

See the HTML file for the theory.

### Installing libraries

We can do this from this notebook or from a command prompt (cmd). Installing from a command prompt is less likely to fail.

In [11]:
#! pip install matplotlib
#! pip install seaborn
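Before running either install command, it can help to check which libraries are actually missing. The sketch below does this with the standard library only; the helper name `missing_packages` is hypothetical and not part of the original notebook.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level packages in `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["matplotlib", "seaborn"]
missing = missing_packages(required)

if missing:
    # Print the command for the interpreter this notebook is running on.
    print(f'"{sys.executable}" -m pip install ' + " ".join(missing))
else:
    print("All required packages are already installed.")
```

Using `sys.executable` points pip at the same Python environment as the running kernel, which avoids installing into a different interpreter by mistake.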

In [12]:
import pandas as pd
from IPython.display import display, HTML, Javascript

In [13]:
df = pd.DataFrame({
	f"Column {i}": range(100) for i in range(25)
})


In [14]:
df

Unnamed: 0,Column 0,Column 1,Column 2,Column 3,Column 4,Column 5,Column 6,Column 7,Column 8,Column 9,...,Column 15,Column 16,Column 17,Column 18,Column 19,Column 20,Column 21,Column 22,Column 23,Column 24
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
2,2,2,2,2,2,2,2,2,2,2,...,2,2,2,2,2,2,2,2,2,2
3,3,3,3,3,3,3,3,3,3,3,...,3,3,3,3,3,3,3,3,3,3
4,4,4,4,4,4,4,4,4,4,4,...,4,4,4,4,4,4,4,4,4,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,95,95,95,95,95,95,95,95,95,95,...,95,95,95,95,95,95,95,95,95,95
96,96,96,96,96,96,96,96,96,96,96,...,96,96,96,96,96,96,96,96,96,96
97,97,97,97,97,97,97,97,97,97,97,...,97,97,97,97,97,97,97,97,97,97
98,98,98,98,98,98,98,98,98,98,98,...,98,98,98,98,98,98,98,98,98,98


In [15]:
# Saving the original Pandas method
_original_repr_html_ = pd.DataFrame._repr_html_


def show_df(content, width='99%', height='380px'):
	"""
	Displays a DataFrame or HTML in a scrollable container.
	"""
	num_rows, num_cols = content.shape if isinstance(content, pd.DataFrame) else (0, 0)
	content_html = content.to_html() if isinstance(content, pd.DataFrame) else str(content)
	
	styles = f"""
	<style>
		.scrollable-table-container {{ width: {width}; height: {height}; overflow-y: auto; border: 1px solid #ccc; padding: 8px; }} 
		table {{ width: 100%; border-collapse: collapse; text-align: left; }} 
		th, td {{ border: 1px solid #ddd; padding: 4.5px; height: 15px; vertical-align: middle; }} 
		th {{ position: sticky; top: 0; background-color: #8C8C8C; z-index: 2; }}
		th:first-child {{ position: sticky; left: 0; z-index: 1; }}
		.summary {{ margin-top: 7px; font-size: 13px; color: #D4D4D4; }}
	</style>
	"""
	
	html_content = f"""
	<div class="scrollable-table-container">{content_html}</div>
	<div class="summary">Totals = {num_rows} rows x {num_cols} columns</div>
	"""
	
	display(HTML(styles + html_content))


def auto_show_df(cls):
	"""
	Decorator overriding the _repr_html_ Pandas method to use the show_df function.
	"""
	def custom_repr(self):
		show_df(self)
		return ''
	
	cls._repr_html_ = custom_repr
	return cls


pd.DataFrame = auto_show_df(pd.DataFrame)


# Function to revert to the original behavior
def undo_show_df():
	"""
	Restores the original pandas display without restarting the kernel.
	"""
	pd.DataFrame._repr_html_ = _original_repr_html_

	display(Javascript("""
        const styleElements = document.querySelectorAll('style');
        styleElements.forEach(el => {
            if (el.innerText.includes('.scrollable-table-container')) {
                el.remove();
            }
        });
    """))
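A possible variation on the override above is to scope it with a context manager, so the pandas default is restored automatically. This is a sketch under assumptions, not the notebook's method: the name `scrollable_frames` is hypothetical, and the replacement returns the HTML string instead of calling `display`.

```python
from contextlib import contextmanager

import pandas as pd


@contextmanager
def scrollable_frames(height="380px"):
    """Temporarily render DataFrames inside a scrollable <div>."""
    original = pd.DataFrame._repr_html_

    def scrollable_repr(self):
        # Return the HTML string; Jupyter renders whatever _repr_html_ returns.
        return (
            f'<div style="max-height: {height}; overflow: auto;">'
            f"{self.to_html()}</div>"
        )

    pd.DataFrame._repr_html_ = scrollable_repr
    try:
        yield
    finally:
        # Always restore the pandas default, even if the body raises.
        pd.DataFrame._repr_html_ = original


demo = pd.DataFrame({"a": range(3)})
with scrollable_frames():
    html_inside = demo._repr_html_()   # produced by the temporary override
```

In a notebook, the DataFrame has to be shown explicitly inside the block, for example with `display(df)` from `IPython.display`, because automatic display only applies to the last expression of a cell.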
	


In [16]:
df

Unnamed: 0,Column 0,Column 1,Column 2,Column 3,Column 4,Column 5,Column 6,Column 7,Column 8,Column 9,Column 10,Column 11,Column 12,Column 13,Column 14,Column 15,Column 16,Column 17,Column 18,Column 19,Column 20,Column 21,Column 22,Column 23,Column 24
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3
4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4
5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8
9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9


In [17]:
undo_show_df()

<IPython.core.display.Javascript object>

In [18]:
df

Unnamed: 0,Column 0,Column 1,Column 2,Column 3,Column 4,Column 5,Column 6,Column 7,Column 8,Column 9,...,Column 15,Column 16,Column 17,Column 18,Column 19,Column 20,Column 21,Column 22,Column 23,Column 24
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
2,2,2,2,2,2,2,2,2,2,2,...,2,2,2,2,2,2,2,2,2,2
3,3,3,3,3,3,3,3,3,3,3,...,3,3,3,3,3,3,3,3,3,3
4,4,4,4,4,4,4,4,4,4,4,...,4,4,4,4,4,4,4,4,4,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,95,95,95,95,95,95,95,95,95,95,...,95,95,95,95,95,95,95,95,95,95
96,96,96,96,96,96,96,96,96,96,96,...,96,96,96,96,96,96,96,96,96,96
97,97,97,97,97,97,97,97,97,97,97,...,97,97,97,97,97,97,97,97,97,97
98,98,98,98,98,98,98,98,98,98,98,...,98,98,98,98,98,98,98,98,98,98


In [19]:
print(pd.DataFrame)

<class 'pandas.core.frame.DataFrame'>
