# TensorFlow Linear Regression

### This implementation uses TensorFlow/Keras to build a linear regression model

### Goal: Predict used car prices and compare with No-Framework, Scikit-Learn, and PyTorch

What TensorFlow/Keras provides (that we built manually in No-Framework):
- `tf.keras.Sequential`: High-level API for building models layer by layer
- `tf.keras.layers.Dense`: Fully connected layer (replaces manual weights + bias)
- `tf.keras.losses.MeanSquaredError`: Pre-built loss function
- `tf.keras.optimizers.SGD`: Optimizer that handles parameter updates
- `model.fit()`: Complete training loop in one line
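
A minimal sketch of how these pieces fit together, using synthetic data rather than the car dataset. The two-feature input, the target coefficients, the learning rate, the batch size, and the epoch count are all assumptions chosen for illustration, not the settings used later in this notebook.

```python
import numpy as np
import tensorflow as tf

# Synthetic, noise-free data (illustrative assumption): y = 3*x1 - 2*x2 + 1
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2)).astype("float32")
y = (X @ np.array([3.0, -2.0], dtype="float32") + 1.0).reshape(-1, 1)

tf.random.set_seed(0)

# A single Dense unit with no activation is a linear model: y_hat = X @ w + b
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(1),
])

# Attach the pre-built loss and optimizer
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss=tf.keras.losses.MeanSquaredError(),
)

# fit() runs the batching, forward pass, gradient computation, and updates
history = model.fit(X, y, epochs=50, batch_size=32, verbose=0)

# Learned parameters of the Dense layer
weights, bias = model.layers[-1].get_weights()
print("weights:", weights.ravel(), "bias:", bias)
```

On data like this, the learned weights and bias should end up close to the coefficients used to generate `y`.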

Key Concept - Keras vs Raw TensorFlow:
- TensorFlow 2.x uses Keras as its high-level API
- Keras abstracts away the computational graph complexity
- Keras models play a role similar to PyTorch's `nn.Module`, with more compact syntax available through the Sequential API
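
For contrast, here is a rough sketch of what a comparable model can look like in lower-level TensorFlow, where the variables, the gradient computation (`tf.GradientTape`), and the update step are written by hand. The synthetic data, learning rate, and step count are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Synthetic, noise-free data (illustrative assumption): y = 3*x1 - 2*x2 + 1
rng = np.random.default_rng(0)
X = tf.constant(rng.normal(size=(256, 2)), dtype=tf.float32)
y = tf.matmul(X, tf.constant([[3.0], [-2.0]])) + 1.0

# Parameters that Dense(1) would otherwise create and manage
w = tf.Variable(tf.zeros((2, 1)))
b = tf.Variable(tf.zeros((1,)))
learning_rate = 0.1

for step in range(200):
    # Record the forward pass so gradients can be computed
    with tf.GradientTape() as tape:
        y_pred = tf.matmul(X, w) + b
        loss = tf.reduce_mean(tf.square(y_pred - y))  # mean squared error
    grad_w, grad_b = tape.gradient(loss, [w, b])
    # Plain full-batch gradient descent update
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

final_loss = float(loss)
print("final loss:", final_loss)
```

The Keras version hides this loop behind `compile()` and `fit()`; the arithmetic underneath is of the same kind.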


In [1]:
# tensorflow: The main TensorFlow library
import tensorflow as tf

# numpy: Still needed for initial data handling
import numpy as np

# pandas: For loading CSV data
import pandas as pd

# matplotlib: for visualizations
import matplotlib.pyplot as plt

# os: File path handling
import os

# Sklearn utilities: Using these for consistency with previous implementations
# This ensures identical train/test splits and scaling
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Performance tracking
import time
import tracemalloc
import platform

# Set random seed for reproducibility
RANDOM_SEED = 113
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)

print("All Imports successful!")
print(f"TensorFlow version: {tf.__version__}")
print(f"Random seed set to: {RANDOM_SEED}")

All Imports successful!
TensorFlow version: 2.20.0
Random seed set to: 113
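
The import comments say the scikit-learn utilities are reused so that splits and scaling match the other implementations. A small sketch of why a fixed `random_state` gives that guarantee, on toy arrays. The `test_size=0.2` value and the toy data are assumptions here, not values taken from this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

RANDOM_SEED = 113

# Toy data (illustrative assumption): 20 samples, 2 features
X = np.arange(40, dtype=float).reshape(20, 2)
y = np.arange(20, dtype=float)

# Two independent calls with the same random_state produce the same split
split_a = train_test_split(X, y, test_size=0.2, random_state=RANDOM_SEED)
split_b = train_test_split(X, y, test_size=0.2, random_state=RANDOM_SEED)
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print("Identical splits:", same_split)

# Fit the scaler on the training set only, then apply it to both sets
X_train, X_test, y_train, y_test = split_a
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training rows alone keeps test-set statistics out of the preprocessing step.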


# Load Cleaned Data

- Load the same pre-processed dataset used in No-Framework, Scikit-Learn, and PyTorch
- Using pandas for consistency with the Scikit-Learn implementation
- This ensures fair comparison across all frameworks

In [2]:
# Define path to our cleaned dataset
DATA_PATH = os.path.join('..', '..', 'data', 'processed', 'vehicles_clean.csv')

# Load data using pandas
df = pd.read_csv(DATA_PATH)

# Verify data loaded correctly
print(f"Dataset shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")
print("\nFirst 3 rows:")
print(df.head(3))

Dataset shape: (100000, 12)
Columns: ['price', 'year', 'manufacturer', 'condition', 'cylinders', 'fuel', 'odometer', 'title_status', 'transmission', 'drive', 'type', 'state']

First 3 rows:
   price    year  manufacturer  condition  cylinders  fuel  odometer  \
0  29990  2014.0             7          2          6     2   26129.0   
1   6995  2006.0            12          0          6     2  198947.0   
2   4995  2009.0            35          6          8     2  152794.0   

   title_status  transmission  drive  type  state  
0             0             2      0     8     17  
1             6             0      3    10      5  
2             0             0      3    11     22  
