# Using shallow learning algorithms from ManufacturingNet
##### To learn more about ManufacturingNet, visit http://manufacturingnet.io/

In [2]:
import ManufacturingNet
import numpy as np

First, we import ManufacturingNet, which we will use to experiment with several shallow learning models.

Note that all of the package's dependencies must also be installed in your environment.

##### The dataset first needs to be downloaded. The `datasets` module provides several curated datasets, and each one can be downloaded with two lines of code.

In [3]:
from ManufacturingNet import datasets

In [4]:
datasets.ThreeDPrintingData()

Downloading...
From: https://drive.google.com/uc?id=1VhZcOgNOEw_Sciuww25XZdIuaqO90Nkj
To: /home/cmu/ManufacturingNet/tutorials/ThreeDPrintingData.zip
100%|██████████| 928/928 [00:00<00:00, 2.50MB/s]


The dataset should now be downloaded and present in the working directory.
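Before loading the data, it can help to confirm that the file is where the next step expects it. A minimal sketch, assuming the archive is extracted to the relative path `3D_printing_dataset/data.csv` used later in this tutorial:

```python
from pathlib import Path

# Path assumed from the pd.read_csv call used later in this tutorial.
data_path = Path("3D_printing_dataset/data.csv")

if data_path.exists():
    print(f"Found dataset at {data_path.resolve()}")
else:
    print(f"{data_path} not found; re-run the download cell or check the working directory.")
```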

The 3D Printing dataset consists of several continuous and discrete parameters. We can perform classification or regression, depending on the desired output attribute. Here, we first perform classification by predicting the material used from the input and measured parameters, and then perform regression on a different attribute.


### Loading the dataset
Here, we use the pandas library to read the data, since it contains categorical attributes. If pandas is not installed in your environment, see the getting-started guide: https://pandas.pydata.org/docs/getting_started/index.html

In [5]:
import pandas as pd

In [6]:
data = pd.read_csv("3D_printing_dataset/data.csv", sep = ",")

We then encode the categorical attributes, infill pattern and material, as integers.

In [8]:
data.material = [0 if each == "abs" else 1 for each in data.material]
# abs = 0, pla = 1

data.infill_pattern = [0 if each == "grid" else 1 for each in data.infill_pattern]
# grid = 0, honeycomb = 1
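The list comprehensions above map every label other than "abs" (or "grid") to 1, so a misspelled or unexpected label would be encoded as 1 without warning. The following is a minimal sketch of a stricter alternative on a toy DataFrame with illustrative values. `encode_column` is a hypothetical helper written for this example, not part of ManufacturingNet.

```python
import pandas as pd

# Toy frame standing in for the 3D printing data (values are illustrative only).
toy = pd.DataFrame({
    "infill_pattern": ["grid", "honeycomb", "grid"],
    "material": ["abs", "pla", "pla"],
})

# Same encoding as the tutorial: abs = 0, pla = 1; grid = 0, honeycomb = 1.
material_codes = {"abs": 0, "pla": 1}
pattern_codes = {"grid": 0, "honeycomb": 1}


def encode_column(frame, column, codes):
    """Map a column through `codes`, raising if any label is not listed."""
    encoded = frame[column].map(codes)
    if encoded.isna().any():
        unknown = sorted(set(frame[column][encoded.isna()]))
        raise ValueError(f"Unexpected values in {column!r}: {unknown}")
    return encoded.astype(int)


toy["material"] = encode_column(toy, "material", material_codes)
toy["infill_pattern"] = encode_column(toy, "infill_pattern", pattern_codes)
print(toy)
```

With only two known labels per column, both approaches give the same result. The explicit mapping only differs when the data contains a label that was not anticipated.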

### Classification

For classification, we need the input data and an output variable to predict.
We separate the x and y values from the pandas DataFrame: the value we want to predict is "material", and the input data is every column except "material".

In [9]:
y_data = data.material.values
x_data = data.drop(["material"],axis=1).values
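A quick shape check can catch a mismatched split before any model is run. The following sketch uses a toy DataFrame; the column names `feature_a` and `feature_b` are placeholders, not columns from the real dataset.

```python
import pandas as pd

# Toy frame with placeholder feature columns and an already-encoded target.
toy = pd.DataFrame({
    "feature_a": [0.02, 0.06, 0.10, 0.15],
    "feature_b": [0, 1, 0, 1],
    "material": [0, 1, 1, 0],
})

y_toy = toy["material"].values
x_toy = toy.drop(columns=["material"]).values

# One row of inputs per label, and one fewer column than the full frame.
print("x shape:", x_toy.shape)
print("y shape:", y_toy.shape)
assert x_toy.shape[0] == y_toy.shape[0]
assert x_toy.shape[1] == toy.shape[1] - 1
```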


We first get a bird's-eye view of how some default classifiers perform on the data. The classifiers are run with default parameter values and compared on accuracy, 5-fold cross-validation score, and run time.

This gives users a quick look at how the available classifiers might perform on their data.
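To make the 5-fold cross-validation metric concrete, here is a minimal NumPy sketch of the idea: the data is split into five folds, and each fold is held out once while the model is fit on the other four. This is not ManufacturingNet's implementation. The nearest-centroid classifier and the synthetic data are stand-ins chosen to keep the example self-contained.

```python
import numpy as np


def k_fold_accuracy(x, y, fit_predict, k=5, seed=0):
    """Return the accuracy on each of k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predictions = fit_predict(x[train_idx], y[train_idx], x[test_idx])
        scores.append(np.mean(predictions == y[test_idx]))
    return np.array(scores)


def nearest_centroid(x_train, y_train, x_test):
    """Stand-in classifier: assign each test point to the closest class mean."""
    labels = np.unique(y_train)
    centroids = np.stack([x_train[y_train == label].mean(axis=0) for label in labels])
    distances = np.linalg.norm(x_test[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(distances, axis=1)]


# Synthetic two-class data: two well-separated Gaussian clusters.
rng = np.random.default_rng(42)
x_demo = np.vstack([rng.normal(0.0, 1.0, (25, 3)), rng.normal(3.0, 1.0, (25, 3))])
y_demo = np.array([0] * 25 + [1] * 25)

fold_scores = k_fold_accuracy(x_demo, y_demo, nearest_centroid)
print("Per-fold accuracy:", fold_scores)
print("Mean accuracy:", fold_scores.mean())
```

Averaging over the folds gives a less optimistic estimate than accuracy measured on the same data the model was trained on.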

In [10]:
from ManufacturingNet.models import AllClassificationModels

In [None]:
all_models = AllClassificationModels(x_data, y_data)
all_models.run()

To tune a particular classifier in more detail, users can choose that classifier and pass the data to it directly.

They can either keep the default parameters displayed or customize them to suit their requirements.

In [None]:
from ManufacturingNet.models import RandomForest

rf_model = RandomForest(x_data, y_data)

rf_model.run_classifier()

### Regression

For regression, we need the input data and an output value to predict.
We again separate the x and y values from the pandas DataFrame. In this example, the value we want to predict is "roughness", and the input data is every column except "roughness".

In [13]:
y_data_lin = data.roughness.values
x_data_lin = data.drop(["roughness"],axis=1).values

We first get a bird's-eye view of how some default regression models perform on the data. The models are run with default parameters and compared on R² score and the time taken to run the algorithm.

This gives users a quick look at how the available regression models might perform on their data.
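As an illustration of the two metrics, the sketch below fits an ordinary least-squares model with NumPy on synthetic data, times the fit, and computes the R² score as one minus the ratio of residual to total sum of squares. It is not how ManufacturingNet computes these values internally, and the data is made up for the example.

```python
import time

import numpy as np

# Synthetic regression data: a linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, size=(100, 2))
y_demo = 3.0 * x_demo[:, 0] - 2.0 * x_demo[:, 1] + rng.normal(0.0, 0.5, size=100)

# Append a column of ones so the fit includes an intercept.
design = np.column_stack([x_demo, np.ones(len(x_demo))])

start = time.perf_counter()
coefficients, *_ = np.linalg.lstsq(design, y_demo, rcond=None)
elapsed = time.perf_counter() - start

# R^2 = 1 - SS_res / SS_tot, computed here on the training data itself.
predictions = design @ coefficients
ss_res = np.sum((y_demo - predictions) ** 2)
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"R2 score: {r2:.4f}")
print(f"Fit time: {elapsed:.6f} s")
```

An R² close to 1 means the model explains most of the variance in the target. A value near 0 means it does little better than predicting the mean.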

In [None]:
from ManufacturingNet.models import AllRegressionModels

models_reg = AllRegressionModels(x_data_lin, y_data_lin)
models_reg.run()

To tune a particular regression model in more detail, users can choose that model and pass the data to it directly.

They can either keep the default parameters displayed or customize them to suit their requirements.

In [None]:
from ManufacturingNet.models import LinRegression as LinReg


model_lin = LinReg(x_data_lin, y_data_lin)
model_lin.run()

print("MSE:", model_lin.get_mean_squared_error())
print("R2:", model_lin.get_r2_score())
print("R:", model_lin.get_r_score())
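For reference, the mean squared error and a correlation score can be computed by hand from a set of predictions. The sketch below uses small fixed arrays and assumes that the R score refers to the Pearson correlation coefficient between observed and predicted values. That interpretation is an assumption, not something confirmed by the ManufacturingNet documentation.

```python
import numpy as np

# Small illustrative arrays of observed and predicted values.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# Mean squared error: the average of the squared residuals.
mse = np.mean((y_true - y_pred) ** 2)

# Pearson correlation between observations and predictions
# (assumed here to be what an "R score" denotes).
r = np.corrcoef(y_true, y_pred)[0, 1]

print("MSE:", mse)
print("R:", r)
```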

This is how we can use ManufacturingNet to accomplish classification and regression tasks.
We first obtain a bird's-eye view of how all the available models perform on our data. If we then want to tune a particular model for our data, we can customize the parameters of the model of our choice.