# QUESTION 7

## Import Path Configuration

The parent directory (`..`) is inserted at the front of `sys.path`, so Python can import modules and packages located one level above the current directory. This makes it possible to import custom scripts or packages without moving files or changing the working directory.

In [1]:
import sys
sys.path.insert(0, '..')
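
A relative entry such as `'..'` is resolved against whatever the working directory happens to be when an import runs. The following is a minimal sketch of a more defensive variant, assuming the notebook is started from its own directory; the helper name `add_parent_to_sys_path` is hypothetical and is not part of the original notebook.

```python
import sys
from pathlib import Path


def add_parent_to_sys_path(start: Path) -> str:
    """Insert the absolute parent of `start` at the front of sys.path (once)."""
    parent = str(start.resolve().parent)
    # Avoid piling up duplicate entries when the cell is re-run.
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


# Assumption: the current working directory is the notebook's directory.
parent_dir = add_parent_to_sys_path(Path.cwd())
print(parent_dir in sys.path)
```

Using an absolute path keeps the import working even if a later cell changes the working directory.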

## Package Version Verification

The `check_packages()` function checks that each package listed in the dictionary `d` is installed in the environment with a compatible version. As the output shows, installed versions newer than the ones listed are reported as `[OK]`, so the listed versions act as minimums. This step helps avoid errors caused by version mismatches.

In [2]:
from python_environment_check import check_packages
d = {
    'numpy': '1.21.2',
    'scipy': '1.7.0',
    'mlxtend' : '0.19.0',
    'matplotlib': '3.4.3',
    'sklearn': '1.0',
    'pandas': '1.3.2'
}
check_packages(d)

[OK] Your Python version is 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
[OK] numpy 1.24.3
[OK] scipy 1.8.0
[OK] mlxtend 0.23.1
[OK] matplotlib 3.5.1
[OK] sklearn 1.5.2
[OK] pandas 2.2.2
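
The source of `check_packages()` is not shown here, so the following is only a sketch of how a minimum-version comparison of this kind could work. The helpers `parse_version` and `meets_minimum` are hypothetical names, and the installed versions are copied from the output above rather than queried from the environment.

```python
import re


def parse_version(text):
    """Turn '1.24.3' into (1, 24, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True when `installed` is at least `required`."""
    a, b = parse_version(installed), parse_version(required)
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


required = {"numpy": "1.21.2", "scipy": "1.7.0", "mlxtend": "0.19.0",
            "matplotlib": "3.4.3", "sklearn": "1.0", "pandas": "1.3.2"}
# Versions taken from the notebook output above.
installed = {"numpy": "1.24.3", "scipy": "1.8.0", "mlxtend": "0.23.1",
             "matplotlib": "3.5.1", "sklearn": "1.5.2", "pandas": "2.2.2"}

results = {name: meets_minimum(installed[name], required[name]) for name in required}
for name, ok in results.items():
    print("[OK]" if ok else "[FAIL]", name, installed[name])
```

Comparing integer tuples instead of raw strings matters: as strings, `'1.10'` would sort before `'1.9'`.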


## Package Imports

The core packages for analyzing and visualizing data are imported: numpy for numerical computation, pandas for data manipulation, and matplotlib.pyplot for plotting, along with `LinearRegression` and `PolynomialFeatures` from scikit-learn for the model.

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

---

In [4]:
columns = ['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6', 'Col7', 'Col8', 'Target']
df = pd.read_csv("dataset_regression.csv", 
                 sep=',',
                 usecols=columns)

In [5]:
X = df[['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6', 'Col7', 'Col8']].values
y = df['Target'].values

from sklearn.model_selection import train_test_split

# DEFAULT PARAMETERS
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=123)

# OPTIMAL PARAMETERS
# X_train, X_test, y_train, y_test = train_test_split(
#     X, y, test_size=0.30, random_state=1)

In [6]:
regr = LinearRegression()

cubic = PolynomialFeatures(degree=3)
X_train_cubic = cubic.fit_transform(X_train)

regr_cubic = regr.fit(X_train_cubic, y_train)

print("Cubic Model Coefficients:", regr_cubic.coef_)
print("Cubic Model Intercept:", regr_cubic.intercept_)

new_data_cubic = np.array([[153.0, 8.0, 194.0, 28, 192.0, 623.0, 935.0, 149.0]])

transformed_new_data_cubic = cubic.transform(new_data_cubic)
print("Cubic Transformed Data:", transformed_new_data_cubic[0])

predicted_target_cubic = regr_cubic.predict(transformed_new_data_cubic)
print("Predicted Target:", predicted_target_cubic)

Cubic Model Coefficients: [-6.29648500e-03 -1.17092636e+02  1.20738183e+03 -3.84536965e+02
 -3.30475259e+01 -4.66159165e+02 -2.00086786e+02 -1.49638679e+02
 -1.56922435e+02  3.21233898e-02 -8.51196952e-01  2.36854344e-01
  5.85881000e-02  1.83756425e-01  1.54232961e-01  4.17664547e-02
  1.07812352e-01  6.83483223e-02 -1.63687976e-01  1.12621882e+00
 -1.21362757e+00 -1.49683504e+00 -7.78384388e-01 -8.36070915e-01
  2.17254904e-01 -6.30994986e-02  6.97307977e-01  3.87528594e-01
  2.41303393e-01  2.56914835e-01 -1.81661901e-03  3.10095164e-02
  5.21017270e-02  8.64487030e-03  3.18023173e-02  5.57219643e-01
  3.90960135e-01  3.22890810e-01  2.67743369e-01  8.85196719e-02
  1.29879196e-01  1.58403913e-01  5.16493481e-02  9.85493642e-02
  5.41000440e-02 -3.63917838e-06  2.19833513e-04 -4.20176052e-05
 -1.68433263e-05  6.14316843e-06 -3.17569413e-05 -1.50087588e-06
 -2.11750924e-05  1.54694741e-04  7.10041324e-05 -4.74847060e-04
  3.82906269e-04  5.01083883e-04  2.91089557e-04  3.14326406e-04
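
The coefficient vector is long because `PolynomialFeatures(degree=3)`, with its default `include_bias=True`, expands the 8 input columns into every monomial of total degree 0 to 3. As a rough check on the expected length, the sketch below counts those monomials using only the standard library; the function name `count_polynomial_terms` is illustrative.

```python
from itertools import combinations_with_replacement
from math import comb


def count_polynomial_terms(n_features, degree):
    """Count monomials of total degree 0..degree in n_features variables."""
    return sum(
        sum(1 for _ in combinations_with_replacement(range(n_features), d))
        for d in range(degree + 1)
    )


n_features, degree = 8, 3
# Closed form per degree d: C(n_features + d - 1, d).
per_degree = [comb(n_features + d - 1, d) for d in range(degree + 1)]
total = count_polynomial_terms(n_features, degree)

print(per_degree)  # [1, 8, 36, 120]
print(total)       # 165
```

That is 1 bias column, 8 linear terms, 36 quadratic terms and 120 cubic terms, so the model fits 165 coefficients plus the intercept.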

In [7]:
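# The transformer was already fitted on X_train in the previous cell,
# so both sets are only transformed here; the test set must not be used for fitting.
X_train_cubic = cubic.transform(X_train)
X_test_cubic = cubic.transform(X_test)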

y_train_cubic = regr.predict(X_train_cubic)
y_test_cubic = regr.predict(X_test_cubic)

In [8]:
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score

mae_train = mean_absolute_error(y_train, y_train_cubic)
mae_test = mean_absolute_error(y_test, y_test_cubic)

mse_train = mean_squared_error(y_train, y_train_cubic)
mse_test = mean_squared_error(y_test, y_test_cubic)

r2_train = r2_score(y_train, y_train_cubic)
r2_test = r2_score(y_test, y_test_cubic)

print(f'MSE train: {mse_train:.2f}')
print(f'MSE test: {mse_test:.2f}')

print(f'MAE train: {mae_train:.2f}')
print(f'MAE test: {mae_test:.2f}')

print(f'R² train: {r2_train:.2f}')
print(f'R² test: {r2_test:.2f}')

MSE train: 18.52
MSE test: 38.72
MAE train: 3.38
MAE test: 4.54
R² train: 0.93
R² test: 0.86
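
For reference, the three metrics reported above can be written out from their standard definitions. This is a small pure-Python sketch on made-up numbers (not the notebook's data), intended only to show what each score measures.

```python
def mean_absolute_error_manual(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error_manual(y_true, y_pred):
    """Average squared difference; penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score_manual(y_true, y_pred):
    """1 - SS_res / SS_tot: share of target variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]

mae = mean_absolute_error_manual(y_true, y_pred)
mse = mean_squared_error_manual(y_true, y_pred)
r2 = r2_score_manual(y_true, y_pred)
print(f"MAE: {mae:.3f}  MSE: {mse:.3f}  R²: {r2:.3f}")  # MAE: 0.500  MSE: 0.375  R²: 0.925
```

Read this way, the results above show a test MSE roughly twice the train MSE (38.72 against 18.52), which may indicate some overfitting from the 165-term cubic expansion.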


---

## Converting the Jupyter Notebook to a Python File

### Script in the Current Directory

In [9]:
! python .convert_notebook_to_script.py --input answer7.ipynb --output answer7.py

[NbConvertApp] Converting notebook answer7.ipynb to script
[NbConvertApp] Writing 3913 bytes to answer7.py


### Script in the Parent Directory

In [10]:
! python ../.convert_notebook_to_script.py --input answer7.ipynb --output answer7.py

[NbConvertApp] Converting notebook answer7.ipynb to script
[NbConvertApp] Writing 3913 bytes to answer7.py
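
The `[NbConvertApp]` log lines suggest that the helper script delegates to nbconvert, although its source is not shown. Under that assumption, an equivalent direct call could be assembled as in the sketch below; `build_nbconvert_command` is a hypothetical helper, and the command is only printed, not executed.

```python
from pathlib import Path


def build_nbconvert_command(notebook, output):
    """Build the argument list for `jupyter nbconvert --to script`."""
    # nbconvert's --output expects the base name; the exporter adds the .py suffix.
    return [
        "jupyter", "nbconvert",
        "--to", "script",
        "--output", Path(output).stem,
        str(notebook),
    ]


command = build_nbconvert_command("answer7.ipynb", "answer7.py")
print(" ".join(command))  # jupyter nbconvert --to script --output answer7 answer7.ipynb
```

To run it from Python, the list can be passed to `subprocess.run(command, check=True)`.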
