# PART 2: Intermediate Data Processing

In this Jupyter Notebook, we further investigate the interim datasets through a **processing** lens: we analyze, transform, scale, encode, reduce, and otherwise munge our data to prepare it for predictive analysis and machine-learning-based modeling.

- **NOTE**: Before working through this notebook, please ensure that you have installed all necessary dependencies listed in [Section A: Imports and Initializations](#section-A).

- **NOTE**: Before working through Sections A-D of this notebook, please run all code cells in [Appendix A: Supplementary Custom Objects](#appendix-A) to ensure that all relevant functions and objects are appropriately instantiated and ready for use.

---

## 🔵 TABLE OF CONTENTS 🔵 <a name="TOC"></a>

Use this **table of contents** to navigate the various sections of the processing notebook.

#### 1. [Section A: Imports and Initializations](#section-A)

    All necessary imports and object instantiations for data processing.

#### 2. [Section B: Specialized Encoding](#section-B)

    Data encoding operations, including value range mapping, 
    correlational plotting, and categorical encoding.

#### 3. [Section C: Data Scaling & Transformation](#section-C)

    Data transformations, including standard scaling/normalization
    and feature reduction.

#### 4. [Section D: Saving Our Processed Datasets](#section-D)

    Saving processed data states for further access.

#### 5. [Appendix A: Supplementary Custom Objects](#appendix-A)

    Custom Python object architectures used throughout the data processing.

#### 6. [Appendix B: Revised Data Dictionary](#appendix-B)

    Improved data dictionary representation for our dataset.
    
---

## 🔹 Section A: Imports and Initializations <a name="section-A"></a>

General Imports for Data Manipulation and Visualization.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Algorithms for Data Scaling and Feature Reduction.

In [3]:
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
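
As a quick sanity check that these imports work, the following minimal sketch applies `StandardScaler` followed by `PCA` to a small synthetic array. The array `X`, its shape, and the choice of two components are illustrative assumptions only; they are not this project's data or the settings used later in the notebook.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Hypothetical stand-in data: 100 rows, 3 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(loc=[10.0, 200.0, -5.0], scale=[1.0, 50.0, 0.1], size=(100, 3))

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Project the standardized features onto the top 2 principal components
# (n_components=2 is an arbitrary choice for illustration).
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)
print(pca.explained_variance_ratio_)
```

Scaling before PCA matters here because PCA is variance-driven: without it, the feature with the largest raw scale would dominate the leading component.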

Custom Algorithmic Structures for Processed Data Visualization.

In [4]:
import sys
# Make the custom modules in ../structures/ importable from this notebook.
sys.path.append("../structures/")
# from dataset_processor import Dataset_Processor
from custom_structures import cmat_

#### Instantiate Our Processor Engine

Custom Processor Class for Target-Oriented Data Modification.

**NOTE**: Please refer to [Appendix A: Supplementary Custom Objects](#appendix-A) to view the fully implemented processor object.

In [5]:
# TODO: Create custom dataset processor for in-notebook testing.
# proc = Dataset_Processor()

##### [(back to top)](#TOC)

---