<a href="https://colab.research.google.com/github/docmhvr/Deep_Learning_with_Pytorch/blob/main/Deep_Learning_on_Auto_MPG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Deep Learning with the Auto MPG Dataset**
**Dataset :** Auto MPG [link](https://archive.ics.uci.edu/dataset/9/auto+mpg)

**Reference :** Quinlan, R. (1993). Auto MPG. UCI Machine Learning Repository. https://doi.org/10.24432/C5859H.

"*The data concerns city-cycle fuel consumption in miles per gallon, to be predicted in terms of 3 multivalued discrete and 5 continuous attributes.*" (Quinlan, 1993)

### Data Loading

In [1]:
!pip install ucimlrepo

Collecting ucimlrepo
  Downloading ucimlrepo-0.0.7-py3-none-any.whl.metadata (5.5 kB)
Downloading ucimlrepo-0.0.7-py3-none-any.whl (8.0 kB)
Installing collected packages: ucimlrepo
Successfully installed ucimlrepo-0.0.7


In [19]:
from ucimlrepo import fetch_ucirepo

# fetch dataset
auto_mpg = fetch_ucirepo(id=9)

# data (as pandas dataframes)
X = auto_mpg.data.features
y = auto_mpg.data.targets

# metadata
print(auto_mpg.metadata)

# variable information
print(auto_mpg.variables)


{'uci_id': 9, 'name': 'Auto MPG', 'repository_url': 'https://archive.ics.uci.edu/dataset/9/auto+mpg', 'data_url': 'https://archive.ics.uci.edu/static/public/9/data.csv', 'abstract': 'Revised from CMU StatLib library, data concerns city-cycle fuel consumption', 'area': 'Other', 'tasks': ['Regression'], 'characteristics': ['Multivariate'], 'num_instances': 398, 'num_features': 7, 'feature_types': ['Real', 'Categorical', 'Integer'], 'demographics': [], 'target_col': ['mpg'], 'index_col': ['car_name'], 'has_missing_values': 'yes', 'missing_values_symbol': 'NaN', 'year_of_dataset_creation': 1993, 'last_updated': 'Thu Aug 10 2023', 'dataset_doi': '10.24432/C5859H', 'creators': ['R. Quinlan'], 'intro_paper': None, 'additional_info': {'summary': 'This dataset is a slightly modified version of the dataset provided in the StatLib library.  In line with the use by Ross Quinlan (1993) in predicting the attribute "mpg", 8 of the original instances were removed because they had unknown values for th

### Data Processing

In [20]:
import pandas as pd

In [21]:
# Combine features and target column-wise into a single DataFrame.
dataset = pd.concat([X, y], axis=1)

In [25]:
dataset.head()
dataset.describe()

| | displacement | cylinders | horsepower | weight | acceleration | model_year | origin | mpg |
|---|---|---|---|---|---|---|---|---|
| count | 398.0 | 398.0 | 392.0 | 398.0 | 398.0 | 398.0 | 398.0 | 398.0 |
| mean | 193.425879 | 5.454774 | 104.469388 | 2970.424623 | 15.56809 | 76.01005 | 1.572864 | 23.514573 |
| std | 104.269838 | 1.701004 | 38.49116 | 846.841774 | 2.757689 | 3.697627 | 0.802055 | 7.815984 |
| min | 68.0 | 3.0 | 46.0 | 1613.0 | 8.0 | 70.0 | 1.0 | 9.0 |
| 25% | 104.25 | 4.0 | 75.0 | 2223.75 | 13.825 | 73.0 | 1.0 | 17.5 |
| 50% | 148.5 | 4.0 | 93.5 | 2803.5 | 15.5 | 76.0 | 1.0 | 23.0 |
| 75% | 262.0 | 8.0 | 126.0 | 3608.0 | 17.175 | 79.0 | 2.0 | 29.0 |
| max | 455.0 | 8.0 | 230.0 | 5140.0 | 24.8 | 82.0 | 3.0 | 46.6 |


In [26]:
dataset.isna().sum()

| column | missing values |
|---|---|
| displacement | 0 |
| cylinders | 0 |
| horsepower | 6 |
| weight | 0 |
| acceleration | 0 |
| model_year | 0 |
| origin | 0 |
| mpg | 0 |


In [27]:
dataset = dataset.dropna()

In [35]:
dataset.shape
dataset.tail()
dataset.value_counts()
dataset.isna().sum()

| column | missing values |
|---|---|
| displacement | 0 |
| cylinders | 0 |
| horsepower | 0 |
| weight | 0 |
| acceleration | 0 |
| model_year | 0 |
| origin | 0 |
| mpg | 0 |


### Split into Train and Test data

In [36]:
train_dataset = dataset.sample(frac=0.8, random_state=42)
test_dataset  = dataset.drop(train_dataset.index)

print(train_dataset.shape)
print(test_dataset.shape)

(314, 8)
(78, 8)
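
The split above works because `sample` draws rows without replacement and `drop` then removes those rows by index label, so the two subsets are disjoint as long as the index labels are unique. A minimal sketch that checks this on a hypothetical toy frame (`toy` and its columns are illustrative stand-ins, not the Auto MPG data):

```python
import pandas as pd

# Hypothetical toy frame standing in for the cleaned dataset.
toy = pd.DataFrame({"weight": range(10), "mpg": range(10, 20)})

# Same pattern as above: sample 80% of the rows, keep the rest for testing.
toy_train = toy.sample(frac=0.8, random_state=42)
toy_test = toy.drop(toy_train.index)

# The two subsets should share no index labels and together cover every row.
overlap = toy_train.index.intersection(toy_test.index)
print(len(toy_train), len(toy_test), len(overlap))
```

If the index contained duplicate labels, `drop` would remove every row carrying a sampled label, and the test set could end up smaller than expected.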


### Split Features and Target

In [38]:
X_train = train_dataset.copy()
X_test  = test_dataset.copy()

# Separate target values from features.
y_train = X_train.pop('mpg')
y_test  = X_test.pop('mpg')
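
`DataFrame.pop` removes the named column from the frame in place and returns it as a Series, which is why the train and test frames are copied first. A small sketch on a hypothetical two-row frame (`frame`, `features`, and `target` are illustrative names, not part of the notebook):

```python
import pandas as pd

# Hypothetical two-row frame used only to illustrate pop().
frame = pd.DataFrame({"weight": [3504, 3693], "mpg": [18.0, 15.0]})

features = frame.copy()        # copy so the original frame keeps its 'mpg' column
target = features.pop("mpg")   # removes 'mpg' from features and returns it

print(list(features.columns))  # only the feature columns remain
print(target.tolist())         # the popped target values
print("mpg" in frame.columns)  # the original frame is unchanged
```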

### Training a Linear Neural Network with PyTorch
![Linear regression training workflow](https://github.com/docmhvr/Deep_Learning_with_Pytorch/blob/main/Linear_Regression_Training_workflow.jpg)
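
As a rough illustration of what a linear-regression training loop can look like in PyTorch, the sketch below fits a single `nn.Linear` layer to synthetic data rather than to Auto MPG. The data, the learning rate, and the epoch count are all assumptions chosen for the example, and the loop is not necessarily the exact workflow shown in the figure.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic data (assumption): y = 2*x1 - 3*x2 + 1 plus a little noise.
X = torch.randn(200, 2)
true_w = torch.tensor([[2.0], [-3.0]])
y = X @ true_w + 1.0 + 0.1 * torch.randn(200, 1)

model = nn.Linear(in_features=2, out_features=1)   # single linear layer
loss_fn = nn.MSELoss()                             # mean squared error
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    pred = model(X)              # forward pass
    loss = loss_fn(pred, y)      # compare predictions with targets
    loss.backward()              # backpropagate
    optimizer.step()             # update weights and bias
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

For the real data, `X_train` and `y_train` would first need to be converted to float tensors, and scaling the features would likely help because columns such as `weight` and `acceleration` differ by orders of magnitude.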