# UC Berkeley Milling Dataset
> Reproduces the results from the UC Berkeley Milling Dataset.

Here, we will reproduce the results, figures, and tables from the experiment using the UC Berkeley Milling Dataset. You can find the dataset on the [NASA Prognostics Repository](https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository).

A detailed description of the dataset is found in the [PyPHM example notebook](https://github.com/tvhahn/PyPHM/blob/master/notebooks/milling_example.ipynb) (which you can also [run on Google Colab](https://colab.research.google.com/github/tvhahn/PyPHM/blob/master/notebooks/milling_example.ipynb)).

The first step in reproducing the results is to set up the proper environment and download the data.

## Table of Contents
* [1. Setup Notebook](#1.-Setup-Notebook) - clone the repo and import the required packages
* [2. Explore Data](#2.-Explore-Data) - download the raw data and see how it looks
* [3. Create Train/Val/Test Sets](#3.-Create-Train/Val/Test-Sets) - create the data splits (if you want) and visualize them
* [4. Train Models with a Random Search](#4.-Train-Models-with-a-Random-Search) - train the models
* [5. Summarize Results](#5.-Summarize-Results) - summarize the results of the random search to find the most effective loss functions


# 1. Setup Notebook
**For Google Colab:**
To run the notebook on Google Colab, you must clone the repo and install PyPHM, which is used to download the data. This can be done by running the following cell.

In [None]:
# ONLY RUN IF YOU'RE USING GOOGLE COLAB
!git clone https://github.com/tvhahn/tspipe

# move into project folder
%cd tspipe

!pip install pyphm
!pip install -e .

**Import Packages:**
Don't skip this step! It is needed whether you run the notebook on Google Colab or locally.

In [1]:
from pyphm.datasets.milling import MillingPrepMethodA
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pathlib import Path

import warnings
warnings.filterwarnings("ignore") # suppress all warnings (mainly the matplotlib deprecation warnings)
%load_ext autoreload
%autoreload 2
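
Note that `warnings.filterwarnings("ignore")` silences every warning, not only the deprecation warnings. If you would rather keep other warnings visible, a narrower filter is one option. The following is a minimal sketch using only the standard library; the helper name `count_visible_warnings` is hypothetical and exists only to demonstrate the effect of the filter.

```python
import warnings


def count_visible_warnings() -> int:
    """Emit one DeprecationWarning and one UserWarning; return how many are shown.

    Hypothetical helper, for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Show everything by default inside this context...
        warnings.simplefilter("always")
        # ...then ignore only DeprecationWarning and its subclasses.
        warnings.filterwarnings("ignore", category=DeprecationWarning)

        warnings.warn("old API", DeprecationWarning)
        warnings.warn("something worth seeing", UserWarning)
    return len(caught)


print(count_visible_warnings())
```

In the notebook itself, this would amount to replacing the blanket filter with `warnings.filterwarnings("ignore", category=DeprecationWarning)`, assuming the warnings you want to hide are all deprecation warnings.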

In [2]:
# set the root (parent folder) and the data folder locations
path_colab = Path.cwd().parent.parent / 'content'

if path_colab.exists():
    proj_dir = Path.cwd() # get project folder of repository - use if on colab
else:
    proj_dir = Path.cwd().parent # on local machine in ./notebooks folder

print(proj_dir) 

/home/tim/Documents/feat-store
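
The cell above decides where the project root is by checking whether a `content` folder exists two levels above the current working directory. On Colab, after `%cd tspipe`, that folder is present and the working directory is the project root; locally, the notebook sits in `./notebooks`, so the root is one level up. The same rule can be sketched as a small function. The name `resolve_project_dir` is hypothetical and not part of the repository; the temporary directories only mimic the two layouts.

```python
from pathlib import Path
import tempfile


def resolve_project_dir(cwd: Path) -> Path:
    """Mirror the notebook's rule for locating the project root.

    Hypothetical helper: if a 'content' folder exists two levels above
    cwd (the Colab layout), cwd is the project root; otherwise the
    notebook is assumed to live in ./notebooks and the root is cwd.parent.
    """
    colab_marker = cwd.parent.parent / "content"
    return cwd if colab_marker.exists() else cwd.parent


# Mimic the Colab layout: <tmp>/content/tspipe
with tempfile.TemporaryDirectory() as tmp:
    colab_cwd = Path(tmp) / "content" / "tspipe"
    colab_cwd.mkdir(parents=True)
    colab_result = resolve_project_dir(colab_cwd) == colab_cwd

# Mimic a local layout: <tmp>/repo/notebooks
with tempfile.TemporaryDirectory() as tmp:
    local_cwd = Path(tmp) / "repo" / "notebooks"
    local_cwd.mkdir(parents=True)
    local_result = resolve_project_dir(local_cwd) == local_cwd.parent

print(colab_result, local_result)
```

One caveat of this rule: a local checkout that happens to have a `content` folder two levels above the notebook would be treated as Colab, so printing `proj_dir`, as the cell does, is a useful sanity check.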


In [4]:
!python {proj_dir}/src/dataprep/download_data.py -p {proj_dir}

2022-10-24 11:31:51,508 - __main__ - INFO - Download the datasets
Downloading milling data...
