# QUESTION 6

## Configuring the Import Paths

The parent directory (..) is added to the module search path (sys.path), which lets Python import modules or packages located one level above the current directory. This makes it possible to import custom scripts or packages without moving files or changing the working directory.

In [1]:
import sys
sys.path.insert(0, '..')
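
A relative entry such as '..' is resolved against the working directory at import time, so it silently changes meaning if the working directory changes. The following is a minimal sketch of a more defensive variant, assuming the notebook is run from its own folder; the helper name `add_parent_to_path` is hypothetical and not part of the original notebook.

```python
import sys
from pathlib import Path


def add_parent_to_path(base=None):
    """Prepend the absolute parent of `base` to sys.path, without duplicates.

    `base` defaults to the current working directory (assumption: the
    notebook is started from the folder that contains it).
    """
    base_dir = Path(base) if base is not None else Path.cwd()
    parent = str(base_dir.resolve().parent)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


parent_dir = add_parent_to_path()
print(parent_dir in sys.path)
```

Because the path is made absolute before it is inserted, re-running the cell does not add a second entry.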

## Checking the Package Versions

The check_packages() function verifies that the packages listed in the dictionary 'd' are installed in the environment with suitable versions. The [OK] lines in the output show that versions newer than those listed also pass. This step helps avoid errors caused by version mismatches.

In [2]:
from python_environment_check import check_packages
d = {
    'numpy': '1.21.2',
    'scipy': '1.7.0',
    'mlxtend' : '0.19.0',
    'matplotlib': '3.4.3',
    'sklearn': '1.0',
    'pandas': '1.3.2'
}
check_packages(d)

[OK] Your Python version is 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
[OK] numpy 1.24.3
[OK] scipy 1.8.0
[OK] mlxtend 0.23.1
[OK] matplotlib 3.5.1
[OK] sklearn 1.5.2
[OK] pandas 2.2.2
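
The output above reports [OK] for versions newer than the ones in the dictionary, which is consistent with the listed values acting as minimum versions. The sketch below illustrates such a check under that assumption. It is not the implementation of check_packages(), and the helpers `version_tuple`, `meets_minimum`, and `report` are hypothetical names.

```python
import importlib
import re


def version_tuple(version):
    """Turn '1.24.3' into (1, 24, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if `installed` is at least `required` (numeric parts only)."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '1.0' and '1' compare equal
    b += (0,) * (width - len(b))
    return a >= b


def report(requirements):
    """Build one status line per package, in the style of the output above."""
    lines = []
    for name, minimum in requirements.items():
        try:
            installed = importlib.import_module(name).__version__
        except (ImportError, AttributeError):
            lines.append(f'[FAIL] {name} is not installed or exposes no __version__')
            continue
        status = 'OK' if meets_minimum(installed, minimum) else 'FAIL'
        lines.append(f'[{status}] {name} {installed}')
    return lines


print(meets_minimum('1.24.3', '1.21.2'))
```

Importing the module and reading `__version__` is used here because the import name (sklearn) can differ from the name the package is distributed under.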


## Importing the Packages

The packages needed to analyze and visualize the data are imported: numpy for numerical computation, pandas for data manipulation, matplotlib.pyplot for plotting, and scikit-learn's LinearRegression and PolynomialFeatures for the regression model.

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

---

In [4]:
columns = ['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6', 'Col7', 'Col8', 'Target']
df = pd.read_csv("dataset_regression.csv", 
                 sep=',',
                 usecols=columns)

In [5]:
X = df[['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6', 'Col7', 'Col8']].values
y = df['Target'].values

from sklearn.model_selection import train_test_split

# DEFAULT PARAMETERS
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=123)

# OPTIMAL PARAMETERS
# X_train, X_test, y_train, y_test = train_test_split(
#     X, y, test_size=0.30, random_state=1)

In [6]:
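# Ordinary least-squares estimator; it is fitted on the expanded features below
regr = LinearRegression()

# Expand the 8 input columns into a bias term, linear terms, and degree-2 terms
quadratic = PolynomialFeatures(degree=2)
X_train_quadratic = quadratic.fit_transform(X_train)

# fit() returns the estimator itself, so regr_quadratic and regr are the same object
regr_quadratic = regr.fit(X_train_quadratic, y_train)

print("Quadratic Model Coefficients:", regr_quadratic.coef_)
print("Quadratic Model Intercept:", regr_quadratic.intercept_)

# A single new sample with the 8 feature values in column order (Col1 to Col8)
new_data_quadratic = np.array([[207.0, 5.0, 161.0, 28, 179.0, 736.0, 867.0, 132.0]])

# Apply the already-fitted expansion to the new sample before predicting
transformed_new_data_quadratic = quadratic.transform(new_data_quadratic)
print("Quadratic Transformed Data:", transformed_new_data_quadratic[0])

predicted_target_quadratic = regr_quadratic.predict(transformed_new_data_quadratic)
print("Predicted Target:", predicted_target_quadratic)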

Quadratic Model Coefficients: [-6.25470471e-11  2.71787555e+00  4.22714493e+01  2.25824438e+00
  7.95429220e-03  1.72799205e+01  4.34315102e+00  3.38671588e+00
  3.05336910e+00 -3.21099031e-04 -1.31996534e-02 -3.19647039e-05
  4.91112968e-04 -4.94193695e-03 -9.00028592e-04 -7.64128613e-04
 -5.81616973e-04 -6.35237634e-02 -2.13208471e-02  3.17232210e-03
 -4.85837843e-02 -1.58751396e-02 -1.45357402e-02 -1.34156793e-02
  4.90927798e-05  9.00727037e-04 -5.65183946e-03 -6.50971010e-04
 -4.94296441e-04 -3.07614472e-04 -5.95238139e-04 -4.01888335e-05
  3.54850834e-04 -1.12719269e-04  2.89366891e-04 -1.08044338e-02
 -6.59061366e-03 -6.10179577e-03 -5.39065119e-03 -9.37891789e-04
 -1.22385694e-03 -1.09568948e-03 -4.80541277e-04 -7.80579438e-04
 -3.83803150e-04]
Quadratic Model Intercept: -5595.264874799473
Quadratic Transformed Data: [1.00000e+00 2.07000e+02 5.00000e+00 1.61000e+02 2.80000e+01 1.79000e+02
 7.36000e+02 8.67000e+02 1.32000e+02 4.28490e+04 1.03500e+03 3.33270e+04
 5.79600e+03 3.70
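
The output above lists 45 coefficients, and the transformed sample starts with 1, followed by the 8 original values and then their pairwise products. The sketch below reproduces that layout in plain Python to show where the numbers come from. The helper `quadratic_expand` is a hypothetical stand-in for illustration, not scikit-learn's implementation.

```python
from itertools import combinations_with_replacement
from math import comb


def quadratic_expand(row):
    """Return [1, x_1..x_n, x_i * x_j for i <= j] for one sample."""
    terms = [1.0]                                   # bias column
    terms.extend(float(v) for v in row)             # linear terms
    terms.extend(float(a * b)                       # squares and interactions
                 for a, b in combinations_with_replacement(row, 2))
    return terms


row = [207.0, 5.0, 161.0, 28, 179.0, 736.0, 867.0, 132.0]
expanded = quadratic_expand(row)

n = len(row)
print(len(expanded))     # 45 terms: 1 bias + 8 linear + 36 degree-2
print(comb(n + 2, 2))    # 45, the closed-form count for degree 2
print(expanded[9:13])    # [42849.0, 1035.0, 33327.0, 5796.0]
```

The first degree-2 entries (207², 207·5, 207·161, 207·28) match the values printed in the transformed data above.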

In [7]:
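# Fit the feature expansion on the training data only, then reuse it for the test data
X_train_quadratic = quadratic.fit_transform(X_train)
X_test_quadratic = quadratic.transform(X_test)

# Predictions of the fitted quadratic model on both splits
y_train_quadratic = regr.predict(X_train_quadratic)
y_test_quadratic = regr.predict(X_test_quadratic)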
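
Keeping the feature expansion and the regressor in step by hand is easy to get wrong, for example by refitting the transformer on the test split. One way to avoid this is to chain both steps with scikit-learn's make_pipeline. The sketch below uses small synthetic data (an assumption for illustration, not the dataset from this notebook), and the `*_demo_*` names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data generated from y = 1 + 2x + 3x^2
X_demo_train = np.arange(10, dtype=float).reshape(-1, 1)
y_demo_train = 1 + 2 * X_demo_train[:, 0] + 3 * X_demo_train[:, 0] ** 2
X_demo_test = np.array([[10.0], [11.0]])

# fit() fits the expansion and the regressor on the training data;
# predict() only applies the already-fitted expansion to new data
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_demo_train, y_demo_train)
y_demo_pred = model.predict(X_demo_test)

print(y_demo_pred)
```

With a pipeline, the same `model.predict(...)` call works for the training split, the test split, and new samples, so the transform step cannot be skipped or refitted by accident.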

In [8]:
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score

mae_train = mean_absolute_error(y_train, y_train_quadratic)
mae_test = mean_absolute_error(y_test, y_test_quadratic)

mse_train = mean_squared_error(y_train, y_train_quadratic)
mse_test = mean_squared_error(y_test, y_test_quadratic)

r2_train = r2_score(y_train, y_train_quadratic)
r2_test = r2_score(y_test, y_test_quadratic)

print(f'MSE train: {mse_train:.2f}')
print(f'MSE test: {mse_test:.2f}')

print(f'MAE train: {mae_train:.2f}')
print(f'MAE test: {mae_test:.2f}')

print(f'R² train: {r2_train:.2f}')
print(f'R² test: {r2_test:.2f}')

MSE train: 51.94
MSE test: 57.41
MAE train: 5.61
MAE test: 5.79
R² train: 0.82
R² test: 0.79
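
To make the three reported metrics concrete, the sketch below computes them by hand on a tiny made-up example (the numbers are illustrative only and unrelated to the results above). The `*_manual` function names are hypothetical; the notebook itself uses the scikit-learn functions.

```python
def mean_absolute_error_manual(y_true, y_pred):
    """Average of |y - y_hat|, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error_manual(y_true, y_pred):
    """Average of (y - y_hat)^2, which penalizes large errors more strongly."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score_manual(y_true, y_pred):
    """1 - SS_res / SS_tot: share of the target variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.0, 5.0, 8.0, 10.0]

print(mean_absolute_error_manual(y_true_demo, y_pred_demo))  # 0.75
print(mean_squared_error_manual(y_true_demo, y_pred_demo))   # 0.75
print(r2_score_manual(y_true_demo, y_pred_demo))             # 0.85
```

MSE is expressed in squared target units, so its square root is the quantity that is directly comparable to MAE.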


---

## Converting the Jupyter Notebook to a Python File

### Script in the Current Directory

In [9]:
! python .convert_notebook_to_script.py --input answer6.ipynb --output answer6.py

[NbConvertApp] Converting notebook answer6.ipynb to script
[NbConvertApp] Writing 4045 bytes to answer6.py


### Script in the Parent Directory

In [10]:
! python ../.convert_notebook_to_script.py --input answer6.ipynb --output answer6.py

[NbConvertApp] Converting notebook answer6.ipynb to script
[NbConvertApp] Writing 4045 bytes to answer6.py
