### **Step 1: Install Required Libraries**

First, let's install the necessary libraries for this project:

In [None]:
!pip install pandas
!pip install scikit-learn
!pip install micromlgen
!git clone https://github.com/BaseMax/C-Minifier.git
!gcc C-Minifier/Minifier.c -o minifier -lm


We'll be using:
- Pandas for storing data in dataframes
- Scikit-learn for model training
- Micromlgen to port the model into a C++ header file
- C-Minifier for compressing C files to assess model size


### **Step 2: Import Libraries**

Next, let's import the libraries we just installed so that our code blocks have access to their functions:



In [None]:
import re
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from micromlgen import port

### **Step 3: Prepare Data**

Now, let's read our CSV data file into a dataframe and split it into training and testing datasets.  
We'll use an 80/20 split between training and testing, but feel free to adjust `test_size` as needed:



In [None]:
csvFile = "compassAndHeadingData.csv"
df = pd.read_csv(csvFile)
X = df[['X', 'Y', 'Z']]
y = df['Heading']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
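
For intuition, the kind of split that `train_test_split` performs can be sketched with the standard library alone. This is an illustration rather than scikit-learn's implementation: the helper name `shuffle_split` and the toy rows are hypothetical. Fixing the seed makes the split repeatable, which is the role `random_state` plays in the real function.

```python
import random

def shuffle_split(rows, test_size=0.2, seed=0):
    """Hypothetical helper: shuffle rows, then hold out a `test_size` fraction."""
    rows = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(rows)  # seeded shuffle, repeatable across runs
    n_test = round(len(rows) * test_size)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]

# Toy stand-in for (X, Y, Z, Heading) rows; not real sensor data
rows = [(i, i + 1, i + 2, i * 3.6) for i in range(100)]
train, test = shuffle_split(rows)
print(len(train), len(test))
```

Without a fixed seed, each run trains and tests on different rows, so the score in Step 5 will vary slightly from run to run.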

### **Step 4: Train the Model**

We'll now train our model on the data we've split. 
We've opted for a Decision Tree model, but you can experiment with other models and parameters.

Supported models for porting with micromlgen include:
- DecisionTree
- RandomForest
- XGBoost
- GaussianNB
- Support Vector Machines (SVC and OneClassSVM)
- Relevance Vector Machines (from skbayes.rvm_ard_models package)
- SEFR
- PCA

In [None]:
model = DecisionTreeRegressor(max_depth=12, min_samples_leaf=20)
model.fit(X_train, y_train)
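
The two hyperparameters above also cap how large the exported tree can get, which matters for the size check in Step 6. As a rough sketch: a binary tree of depth `d` has at most `2**d` leaves, and if every leaf must hold at least `min_samples_leaf` training samples, there can be no more than `n // min_samples_leaf` leaves. The helper name `max_leaves` and the sample count of 10,000 are assumptions for illustration; substitute `len(X_train)` for your own data.

```python
def max_leaves(n_train, max_depth, min_samples_leaf):
    """Hypothetical helper: upper bound on the leaf count of a binary decision tree."""
    by_depth = 2 ** max_depth                 # a full binary tree of this depth
    by_samples = n_train // min_samples_leaf  # each leaf needs min_samples_leaf samples
    return min(by_depth, by_samples)

# 10_000 is an assumed training-set size
print(max_leaves(10_000, max_depth=12, min_samples_leaf=20))  # 500
```

If the ported model turns out too large, lowering `max_depth` or raising `min_samples_leaf` tightens this bound, usually at some cost in accuracy.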


### **Step 5: Evaluate the Model**

Let's evaluate the model's performance on the held-out test data.  
For a regressor, `score` returns the coefficient of determination (R²); a value closer to 1 indicates a better fit:



In [None]:
score = model.score(X_test, y_test)
print("Model Score:", score)
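
The R² value that `score` reports for regressors can be reproduced by hand, which helps when interpreting it. The sketch below uses made-up headings and predictions, and the helper name `r_squared` is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Hypothetical helper: coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # error of the model
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # error of always predicting the mean
    return 1 - ss_res / ss_tot

# Made-up values for illustration only
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]
print(r_squared(y_true, y_pred))
```

A model that always predicts the mean of the test targets scores 0, and a model that does worse than that scores below 0.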


### **Step 6: Port the Trained Model to C++**

Finally, we'll write the model to a C++ header file and use the minifier from earlier to evaluate its size.  
Aim to keep the model under roughly 8 to 10 kilobytes so that it fits within the Arduino's limited memory:



In [None]:
# Write the ported model, then minify it in place
with open('../QMC5883LCompass/model.h', 'w') as file:
    file.write(port(model))
!./minifier ../QMC5883LCompass/model.h ../QMC5883LCompass/model.h

with open('../QMC5883LCompass/model.h', 'r') as file:
    content = file.read()

# The library expects the generated class to be named 'Model'
match = re.search(r'class\s+(\w+)', content)
if match:
    className = match.group(1)
    content = content.replace(className, 'Model')
else:
    print("You'll need to manually change the class name in the model file to 'Model' for the library to work")

with open('../QMC5883LCompass/model.h', 'w') as file:
    file.write(content)

# Report the size of the final file contents
size = len(content.encode('utf-8')) / 1000
print("The model is {} kilobytes".format(size))
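
Note that `str.replace` swaps every occurrence of the class name, including occurrences embedded in longer identifiers. If that ever causes trouble, a word-boundary substitution is a narrower alternative. The sketch below runs on a made-up header string rather than real micromlgen output, and the helper name `rename_class` is hypothetical.

```python
import re

def rename_class(source, new_name="Model"):
    """Hypothetical helper: rename the first declared class, matching whole words only."""
    match = re.search(r'class\s+(\w+)', source)
    if not match:
        return source  # nothing to rename; the file must be edited by hand
    old_name = re.escape(match.group(1))
    return re.sub(rf'\b{old_name}\b', new_name, source)

# Made-up stand-in for a generated header, not actual micromlgen output
header = "class Tree { public: float predict(float *x); }; // TreeHelper untouched"
renamed = rename_class(header)
print(renamed)
print("{} kilobytes".format(len(renamed.encode("utf-8")) / 1000))
```

Measuring the encoded string gives the size in bytes of what will be written to disk, so the size check does not depend on reading the file back.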