In [None]:
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
from IPython.display import display, HTML
import pandas as pd
import os

from utils.visualisation import load_models_history_graph

In [None]:
pio.templates.default = "plotly_dark"
display(HTML("<style>.container { width:100% !important; }</style>"))

In [None]:
train_df = pd.read_feather(os.path.join(os.getcwd(), 'data', 'train.ftr'))

purple = 'rgb(60, 22, 66)'
sea = 'rgb(8, 99, 117)'
light_green = 'rgb(178, 255, 158)'

# Overview
- I trained two models on different time ranges of the data.   
- Both exceeded 90% accuracy on their validation datasets. Hopefully they perform the same on test data!

## I've done two main steps

### 1 - Data recognition
I noticed that the data can be split at the year 2012.  
Before that year only two labels appear; after it, a third label becomes active.  
**So I decided to train two models, one for each time range.**
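
A minimal sketch of how such a split could look, assuming a `year` column like the one in `train_df`. The helper name `split_by_year`, the toy data, and the choice to put 2012 itself in the later range are assumptions for illustration, not necessarily what was done here.

```python
import pandas as pd

SPLIT_YEAR = 2012  # assumption: 2012 itself goes into the later range


def split_by_year(df, split_year=SPLIT_YEAR):
    """Return (rows before split_year, rows from split_year onwards)."""
    before = df[df["year"] < split_year]
    after = df[df["year"] >= split_year]
    return before, after


# Hypothetical toy data standing in for train_df
toy_df = pd.DataFrame({
    "year": [2008, 2011, 2012, 2015],
    "label": [0, 1, 0, 2],
})
before_df, after_df = split_by_year(toy_df)
```

Each part can then be used to train its own model.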

In [None]:
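# `label_copy` duplicates `label` so that a column is left to count after grouping
train_df['label_copy'] = train_df['label']
df_graph = (
    train_df
    .groupby(['year', 'label'])
    .agg({'label_copy': 'count'})
    .reset_index()
)
# Cast labels to str so Plotly treats them as discrete categories
df_graph['label'] = df_graph['label'].astype(str)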
colors_dict = {
    '0': purple,
    '1': light_green,
    '2': sea
}

fig = px.bar(
    df_graph,
    x='year',
    y='label_copy',
    color='label',
    color_discrete_map=colors_dict
)
fig.update_layout(
    title='Label distribution over time<br><b>Deforestation of type 1 appears only after 2012</b><br>',
    yaxis_title='count',
    height=450,
    width=1100
)
fig.show(renderer='notebook')

### 2 - Image augmentation
I generated about 6k synthetic images using several augmentation techniques (rotation, brightness changes, vertical shift).  
This enlarged the training dataset with new images that resemble the real ones but still give the models new information.  
**It improved the models' accuracy by roughly 26 to 28 percentage points.**
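
The three transformations can be sketched with plain NumPy. This is only an illustration: the actual tooling and parameter ranges used here are not stated, so the function names, the restriction to quarter-turn rotations, the brightness range, and the shift limit below are all assumptions.

```python
import numpy as np


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, keeping values in [0, 1]."""
    return np.clip(image * factor, 0.0, 1.0)


def shift_vertically(image, pixels):
    """Shift rows down (positive) or up (negative), filling the gap with zeros."""
    shifted = np.zeros_like(image)
    if pixels > 0:
        shifted[pixels:] = image[:-pixels]
    elif pixels < 0:
        shifted[:pixels] = image[-pixels:]
    else:
        shifted[:] = image
    return shifted


def rotate_quarter_turns(image, turns):
    """Rotate by multiples of 90 degrees (a simplification of free rotation)."""
    return np.rot90(image, k=turns, axes=(0, 1))


def augment(image, rng):
    """Apply one random rotation, brightness change and vertical shift."""
    image = rotate_quarter_turns(image, int(rng.integers(0, 4)))
    image = adjust_brightness(image, rng.uniform(0.8, 1.2))  # assumed range
    max_shift = image.shape[0] // 10  # assumed limit: 10% of the height
    return shift_vertically(image, int(rng.integers(-max_shift, max_shift + 1)))


rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # hypothetical square RGB image scaled to [0, 1]
augmented = augment(image, rng)
```

Calling `augment` several times per source image yields different variants, which is how a dataset can grow by thousands of images.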

In [None]:
acc_list = [0.65, 0.93]
acc_list_2 = [0.66, 0.92]
labels = ['Standard images', 'Extra 6k augmented images']

fig = load_models_history_graph(
    acc_list, labels, 'Before 2012y', sea,
    acc_list_2, 'After 2012y', light_green,
    height=350
)
fig.show(renderer='notebook')

# To do in the future

Obviously, there are still some necessary steps left; however, I can't code everything in one day :)  
I see two main steps:
- error analysis to show and fix weak points of these models
- hyperparameter tuning 
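
For the error analysis, one possible starting point is a confusion matrix with per-class recall, which shows which labels a model mixes up. The sketch below uses made-up labels and predictions, and the helper names are hypothetical; nothing here comes from the trained models.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label, predicted_label] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    totals = matrix.sum(axis=1)
    return np.divide(
        np.diag(matrix), totals,
        out=np.zeros(len(totals)), where=totals > 0,
    )


# Hypothetical labels and predictions, for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
recall = per_class_recall(cm)
```

Classes with low recall are the ones worth inspecting image by image.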


# Main tools used
![Python](https://www.4biosacademy.com.br/files/thumbs/block_1952-python-logo-3-350x350.png?v=1633611877)
![Tensorflow](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQDjasst-lmQ2zB9sMNPQxQAXrvDmDHxxSNLw&usqp=CAU)
![Pandas](https://i.ibb.co/k2pwyrV/Bez-tytu-u.png)
![Plotly](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Plotly-logo.png/640px-Plotly-logo.png)
![Numpy](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcR_VfYfuw4JGQC0QLtbrhWyAQgW9qD9fXanG34lWGAyI1y34PxtAPagPNkCTAoX7_x7sFw&usqp=CAU)