# Polymer Property Prediction (2/4)

---
In this notebook we will use supervised learning to predict polymer properties.  
When ML models are used to predict polymer properties, supervised learning is generally employed. In supervised learning, models are trained on previously collected, labeled data, in which each data point is paired with a known outcome. By training on this labeled dataset, a supervised ML model learns to map input data to the expected outputs.
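As a rough illustration of the input-to-output mapping described above, the sketch below fits a regressor on synthetic labeled data and then predicts labels for inputs it has not seen. The feature matrix `X`, the labels `y`, and the linear relationship between them are all made up for this example; they are not polymer data, and the choice of `RandomForestRegressor` here is only an assumption for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Hypothetical labeled dataset: 200 samples, 3 numeric features each.
# The labels follow a made-up linear rule plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = 100.0 * X[:, 0] + 50.0 * X[:, 1] + rng.normal(0.0, 1.0, size=200)

# Train on the first 150 labeled points (inputs paired with known outcomes).
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:150], y[:150])

# Predict outcomes for the 50 inputs the model did not see during training,
# then compare the predictions with the known labels.
y_pred = model.predict(X[150:])
score = r2_score(y[150:], y_pred)
print(f"R^2 on unseen points: {score:.3f}")
```

In the notebook itself, the inputs would be numerical representations of polymers and the labels would be measured property values.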

## Python Packages

In [1]:
# Basic python packages
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from math import sqrt
import time
from collections import Counter
import pickle
import collections
import seaborn as sns
import random

# RDKit-related packages
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Descriptors
from rdkit.Chem import rdMolDescriptors
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw

# Scikit learn packages
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split, cross_val_predict, cross_validate
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.linear_model import Lasso, LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.utils import shuffle

# TensorFlow packages
from tensorflow import keras
from tensorflow.keras.models import Sequential, save_model, load_model
from tensorflow.keras.layers import (
    Activation, Conv1D, Conv2D, Dense, Dropout, Flatten, MaxPooling1D, MaxPooling2D,
)
from tensorflow.keras.optimizers import Adam

from rdkit import rdBase
rdBase.DisableLog('rdApp.warning')

## Set Up the Machine Learning Model

### Establishing train/test split

Before jumping into model building, we should ask one question: how are we going to know how well our models are performing?  
We can test the model by asking it to predict the $T_g$ value of materials, compare it with measured values, and 