***Homework Exercise: Linear Regression with TensorFlow***
### ***AUTHOR: Isaac Reyes***

### ***7 Linear Regression with TensorFlow using the California Housing Dataset***

**The goal of this exercise is to implement a linear regression model using TensorFlow to predict house prices based on the California Housing Dataset. The dataset contains features such as median income (MedInc), median house age (HouseAge), average rooms per household (AveRooms), and more. Your task is to build a linear regression model and evaluate its performance.**

*Import the required libraries:*

In [30]:
import tensorflow as tf
import pandas as pd
from sklearn.datasets import fetch_california_housing

Load the California Housing Dataset

In [31]:
raw = fetch_california_housing()
X = pd.DataFrame(data=raw['data'], columns=raw['feature_names'])
y = pd.Series(raw['target'])

In [32]:
print(X,y)

       MedInc  HouseAge  AveRooms  AveBedrms  Population  AveOccup  Latitude  \
0      8.3252      41.0  6.984127   1.023810       322.0  2.555556     37.88   
1      8.3014      21.0  6.238137   0.971880      2401.0  2.109842     37.86   
2      7.2574      52.0  8.288136   1.073446       496.0  2.802260     37.85   
3      5.6431      52.0  5.817352   1.073059       558.0  2.547945     37.85   
4      3.8462      52.0  6.281853   1.081081       565.0  2.181467     37.85   
...       ...       ...       ...        ...         ...       ...       ...   
20635  1.5603      25.0  5.045455   1.133333       845.0  2.560606     39.48   
20636  2.5568      18.0  6.114035   1.315789       356.0  3.122807     39.49   
20637  1.7000      17.0  5.205543   1.120092      1007.0  2.325635     39.43   
20638  1.8672      18.0  5.329513   1.171920       741.0  2.123209     39.43   
20639  2.3886      16.0  5.254717   1.162264      1387.0  2.616981     39.37   

       Longitude  
0        -122.23  
1

**Preprocess the data**

-Normalize the features using the mean and standard deviation.

-Split the dataset into training and testing sets (e.g., 80% for training, 20% for testing).

In [34]:
# Additional libraries:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

In [35]:
# Normalize the features using the mean and standard deviation
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the dataset into training and test sets (80% for training, 20% for testing)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)
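
The cell above fits `StandardScaler` on the full dataset before splitting, so the test rows contribute to the mean and standard deviation. A common alternative is to split first and compute the statistics from the training rows only. The sketch below illustrates that order with plain NumPy on synthetic data; the names `X_demo`, `X_train_std`, and `X_test_std` are hypothetical and not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the feature matrix: 100 rows, 3 features (hypothetical data).
X_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Shuffle the row indices and hold out 20% of the rows for testing.
indices = rng.permutation(len(X_demo))
n_test = int(0.2 * len(X_demo))
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train_demo, X_test_demo = X_demo[train_idx], X_demo[test_idx]

# Compute the statistics from the training rows only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)

# Apply the same training statistics to both splits.
X_train_std = (X_train_demo - mean) / std
X_test_std = (X_test_demo - mean) / std

print(X_train_std.mean(axis=0))  # close to 0 for every feature
print(X_train_std.std(axis=0))   # close to 1 for every feature
```

With scikit-learn, the same order would be `train_test_split` first, then `scaler.fit_transform` on the training features and `scaler.transform` on the test features.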


**Define the TensorFlow graph:**

-Create placeholders for the input features (X) and target variable (y).

-Create variables for the model's weights (W) and bias (b).

-Define the linear regression model using the equation: y_pred = X * W + b.

-Define the loss function as the mean squared error between the predicted values and the true values.

-Choose an optimizer (e.g., Gradient Descent) to minimize the loss function.

In [36]:
# The bare `model = Sequential()` call failed under TensorFlow 2; corrected below.
# tf.placeholder is not part of the top-level TensorFlow 2 API,
# so the model is built with Keras instead of an explicit graph.

# TensorFlow 2 version: a single Dense unit with no activation computes y_pred = X * W + b.
# The input size is the number of features in the dataset.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(1, input_dim=X_train.shape[1]))

# TensorFlow 2: compile the model with the SGD optimizer and the MSE loss function.
model.compile(optimizer='sgd', loss='mean_squared_error')

# TF1-style attempt at defining W and b by hand (left commented out, not used):
#W = tf.Variable(tf.random_normal_initializer([X_train.shape[1],1]))
#b = tf.Variable(tf.random_normal_initializer(1))
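
The task list asks for explicit `W` and `b` variables, a loss, and an optimizer, while the cell above delegates all of that to a Keras `Dense(1)` layer. For comparison, here is a minimal TensorFlow 2 sketch of the same pieces written by hand, using `tf.Variable` and `tf.GradientTape` in place of TF1 placeholders and sessions. It runs on synthetic data, and names such as `X_demo`, `true_W`, `predict`, and `mse_loss` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data (hypothetical): 200 samples, 3 features, known weights.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3)).astype("float32")
true_W = np.array([[1.5], [-2.0], [0.5]], dtype="float32")
X_t = tf.constant(X_demo)
y_t = tf.constant(X_demo @ true_W + 0.3)  # shape (200, 1)

# Trainable parameters: one weight per feature plus a bias.
W = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(tf.zeros([1]))

def predict(X):
    # Linear model: y_pred = X W + b
    return tf.matmul(X, W) + b

def mse_loss(y_true, y_pred):
    # Mean squared error over all samples
    return tf.reduce_mean(tf.square(y_true - y_pred))

learning_rate = 0.1
losses = []
for epoch in range(100):
    with tf.GradientTape() as tape:
        loss = mse_loss(y_t, predict(X_t))
    grad_W, grad_b = tape.gradient(loss, [W, b])
    # Plain (full-batch) gradient descent update
    W.assign_sub(learning_rate * grad_W)
    b.assign_sub(learning_rate * grad_b)
    losses.append(float(loss))
    if epoch % 20 == 0:
        print(f"epoch {epoch}: loss = {losses[-1]:.4f}")
```

The update is written out with `assign_sub` to keep the gradient descent step visible; a Keras optimizer could take its place.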

### TRAIN THE MODEL

-Initialize TensorFlow session.

-Initialize the model's variables.

-Set the number of training epochs and the learning rate.

-For each epoch, iterate through the training dataset and update the model's parameters using the optimizer.

-Print the training loss at regular intervals.

In [37]:
# Define the optimizer and the loss function.
# Re-compiling replaces the SGD optimizer set in the previous cell with Adam.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mean_squared_error')


# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32)


Epoch 1/50
Epoch 2/50
...
Epoch 50/50


<keras.callbacks.History at 0x214089605d0>

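The `History` object returned by `model.fit` stores the per-epoch training loss in its `history` dictionary, which is one way to print the loss at regular intervals as the task list asks. The sketch below assumes a small synthetic dataset and a hypothetical model named `model_demo`; it does not reproduce the notebook's training run.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data (hypothetical): 256 samples, 3 features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(256, 3)).astype("float32")
y_demo = X_demo @ np.array([[1.0], [-1.0], [0.5]], dtype="float32")  # shape (256, 1)

# Same architecture as the notebook: a single linear unit.
model_demo = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model_demo.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
    loss="mean_squared_error",
)

# verbose=0 silences the progress bars; the losses are still recorded.
history = model_demo.fit(X_demo, y_demo, epochs=20, batch_size=32, verbose=0)

# history.history["loss"] holds one training-loss value per epoch.
train_loss = history.history["loss"]
for epoch in range(0, len(train_loss), 5):
    print(f"epoch {epoch + 1}: loss = {train_loss[epoch]:.4f}")
```

In the notebook itself, assigning the result of `model.fit(...)` to a variable would make the same dictionary available for the real training run.
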
# Evaluate the model:

-Use the trained model to make predictions on the test dataset.

-Calculate the mean squared error (MSE) between the predicted and true values.

-Print the MSE as a measure of the model's performance.

In [39]:
# Use the trained model to make predictions on the test set
y_pred = model.predict(X_test)



In [40]:
# Compute the mean squared error (MSE) between the predicted and true values
mse = tf.keras.losses.MSE(y_test, y_pred)
print(mse)
print('MSE: ', mse.numpy())

tf.Tensor([3.11988   1.398612  1.6972543 ... 7.2466793 2.0868456 1.3117647], shape=(4128,), dtype=float32)
MSE:  [3.11988   1.398612  1.6972543 ... 7.2466793 2.0868456 1.3117647]


In [41]:
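# Compute the MSE between the predictions and the true labels.
# Note: y_test is 1-D while y_pred has shape (n, 1); flatten y_pred first
# (e.g. y_pred.ravel()) if each prediction should be compared with its own label.
mse = tf.keras.losses.MSE(y_test, y_pred)
print('MSE on test data:', tf.reduce_mean(mse).numpy())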

MSE on test data: 2.1642709
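
One caveat about the figure above: `y_test` is a 1-D Series of 4128 values, while `model.predict` returns an array of shape `(4128, 1)`. When shapes differ like this, the subtraction can broadcast to every prediction paired with every label, and the `shape=(4128,)` output printed earlier is consistent with that. If so, the reported value is not the per-sample test MSE. The toy NumPy example below shows the effect on three hand-picked values; the arrays are made up for illustration.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,), like a 1-D Series of targets
y_pred = np.array([[1.5], [2.0], [2.0]])  # shape (3, 1), like model.predict output

# Mismatched shapes broadcast to a (3, 3) matrix of all pairwise differences.
pairwise = (y_true - y_pred) ** 2
print(pairwise.shape)  # (3, 3)
mse_broadcast = pairwise.mean()

# Flattening the predictions aligns each prediction with its own target.
mse_aligned = np.mean((y_true - y_pred.ravel()) ** 2)

print(mse_broadcast, mse_aligned)  # the two values differ
```

For the notebook's arrays, the aligned form would be `np.mean((y_test.to_numpy() - y_pred.ravel()) ** 2)`, or equivalently `mean_squared_error(y_test, y_pred)` from the scikit-learn import in the preprocessing cell.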
