# Type charts

By **Franklin Oliveira**

-----
This notebook contains all the code needed to build the "type" charts from the `carcinos` database, including some basic data treatment and the chart code itself.

Database: <font color='blue'>'Planilha geral Atualizada FINAL 5_GERAL_sendo trabalhada no Google drive.xlsx'</font>

In [1]:
import datetime
import numpy as np
import pandas as pd

from collections import defaultdict

# packages for quick visualization
import seaborn as sns
import matplotlib.pyplot as plt

# package for the main visualization
import altair as alt

# enabling the notebook renderer
# alt.renderers.enable('notebook')
alt.renderers.enable('default')


# disabling the row limit
alt.data_transformers.disable_max_rows()

DataTransformerRegistry.enable('default')

## Importing data...

In [3]:
NewTable = pd.read_csv('./data/treated_db.csv', sep=';', encoding='utf-8-sig', low_memory=False)
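
The import above reads a semicolon-delimited file encoded as UTF-8 with a byte-order mark. A minimal sketch of the same call on an in-memory sample; the three columns and two rows are hypothetical stand-ins for `./data/treated_db.csv`, not its real contents:

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-separated, UTF-8 with a leading BOM (\ufeff)
raw = (
    "\ufeffcatalog_number;order;type_status\n"
    "1;Decapoda;Holótipo\n"
    "2;Isopoda;\n"
).encode("utf-8")

# 'utf-8-sig' strips the BOM if present; low_memory=False parses the file
# in one pass, so column dtypes are inferred from the whole column
sample = pd.read_csv(io.BytesIO(raw), sep=";", encoding="utf-8-sig", low_memory=False)

print(sample.columns.tolist())
print(sample)
```

The empty `type_status` field on the second row is read as a missing value, which is why most specimens later show up as non-type.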

## Filtering

At least for now, we consider only specimens of the order Decapoda, which has been thoroughly revised by the Museum's crew.

In [4]:
decapoda = NewTable[NewTable['order'] == 'Decapoda'].copy()
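
The filter above is a boolean mask followed by `.copy()`. A small sketch of why the copy matters, on a hypothetical three-row table (the genus names are only illustrative):

```python
import pandas as pd

# Hypothetical toy table, not the real database
table = pd.DataFrame({
    "order": ["Decapoda", "Isopoda", "Decapoda"],
    "genus": ["Callinectes", "Ligia", "Uca"],
})

# Boolean mask keeps only Decapoda rows; .copy() makes the subset independent
subset = table[table["order"] == "Decapoda"].copy()

# Editing the copy leaves the original table untouched
subset["genus"] = subset["genus"].str.upper()

print(subset)
print(table)
```

Without `.copy()`, later column assignments on the subset can trigger pandas' `SettingWithCopyWarning`.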

<br>

<font size=5>**Color palette**</font>

Colors (per infraorder): 

- <font color='#e26d67'><b>Ascacidae</b></font>
- <font color='#007961'><b>Anomura</b></font>
- <font color='#7a2c39'><b>Achelata</b></font>
- <font color='#b67262'><b>Axiidea</b></font>
- <font color='#ee4454'><b>Brachyura</b></font>
- <font color='#3330b7'><b>Caridea</b></font>
- <font color='#58b5e1'><b>Gebiidea</b></font>
- <font color='#b8e450'><b>Stenopodidea</b></font>
- <font color='#a0a3fd'><b>Astacidae</b></font>
- <font color='#deae9e'><b>Polychelida</b></font>
- <font color='#d867be'><b>Grapsidae</b></font>
- <font color='#fece5f'><b>Xanthoidea</b></font>

In [5]:
# importing customized color palettes
from src.MNViz_colors import *

<br>


## Graphs

---

### Types (*per year*) per family

x: Species1, color: Type Status1, size: counts

In [7]:
# p.s.: the large majority is non-type
decapoda['type_status'].value_counts()

Parátipo         78
Holótipo         33
Paralectótipo     6
Alótipo           3
Síntipo           2
Lectótipo         2
Topótipo          2
Neótipo           1
Material tipo     1
Name: type_status, dtype: int64

In [31]:
# subsetting
teste = decapoda[['min_depth','family','order', 'start_year', 'qualifier', 'catalog_number', 
                  'genus', 'species', 'type_status']].copy()

# grouping by type, year and family
temp = teste.groupby(['type_status','start_year', 'family']).count()['species'].reset_index().rename(columns={
    'species':'counts'
})

# p.s.: Cótipo and Topótipo are not types; 'Material tipo' and 'Tipo' are dropped as well
temp = temp[~(temp['type_status'].isin(['Cótipo', 'Topótipo', 'Material tipo', 'Tipo']))]
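
The aggregation above can be sketched on a toy table. Two details are worth seeing in isolation: `count()` only counts non-null `species` values, and `groupby` drops rows whose key is missing. The rows below are hypothetical, and `size()` is shown only as a contrast, not as what the notebook uses:

```python
import pandas as pd

# Hypothetical toy rows, not the real database
toy = pd.DataFrame({
    "type_status": ["Holótipo", "Parátipo", "Parátipo", "Topótipo", None],
    "start_year": [1990, 1990, 1990, 2001, 2001],
    "family": ["Portunidae"] * 5,
    "species": ["a", "b", None, "c", "d"],
})

keys = ["type_status", "start_year", "family"]

# Same pattern as the notebook: count non-null 'species' per group
counts = (
    toy.groupby(keys).count()["species"]
    .reset_index()
    .rename(columns={"species": "counts"})
)

# Drop the statuses excluded from the charts
counts = counts[~counts["type_status"].isin(["Cótipo", "Topótipo", "Material tipo", "Tipo"])]

# For contrast: size() counts rows regardless of nulls in 'species'
sizes = toy.groupby(keys).size()

print(counts)
print(sizes)
```

Here the Parátipo group has two rows but a `counts` value of 1, because one of its `species` entries is missing.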

In [32]:
temp['type_status'].unique()

array(['Alótipo', 'Holótipo', 'Lectótipo', 'Neótipo', 'Paralectótipo',
       'Parátipo', 'Síntipo'], dtype=object)

### Types chart

In [34]:
tipo = alt.Chart(temp, height=150, title='Types per year').mark_circle().encode(
    x = alt.X('start_year:O', title='Sampling Year'),
    y = alt.Y('type_status:N', title= 'Type',
              sort=alt.EncodingSortField('counts', op='sum', order='descending')),
    color= alt.Color('family:N', title='Family',
                     scale=alt.Scale(domain=list(cores_familia_naive.keys()), 
                                     range=list(cores_familia_naive.values())),
                     legend= alt.Legend(columns=6, symbolLimit=102,
                                       direction='horizontal', orient='bottom')), 
    size= alt.Size('counts'),
    order= alt.Order('counts', sort='descending'),  # smaller points in front
    tooltip= [alt.Tooltip('type_status', title='type'),
              alt.Tooltip('start_year', title='start year'),
              alt.Tooltip('counts', title='counts')]
)

tipo = tipo.configure_title(fontSize=16).configure_axis(
    labelFontSize=12,
    titleFontSize=12
).configure_legend(
    labelFontSize=12,
    titleFontSize=12
)

# tipo.save('./graphs/tipo/tipos_por_ano-colors_per_family.html')

# tipo

## Types per Genus 

Same chart as above, with genus on the Y axis; color encodes family and point shape encodes type status.

In [35]:
# subsetting (note: starts from NewTable, i.e. all orders, not only the Decapoda subset)
teste = NewTable[['min_depth','family','infraorder', 'start_year', 'qualifier', 'catalog_number', 
                  'genus', 'species', 'type_status']].copy()

# grouping by type, year, genus and family
temp = teste.groupby(['type_status','start_year', 'genus', 'family']).count()['infraorder'].reset_index().rename(columns={
    'infraorder':'counts'
})

# p.s.: Cótipo and Topótipo are not types; 'Material tipo' and 'Tipo' are dropped as well
temp = temp[~(temp['type_status'].isin(['Cótipo', 'Topótipo', 'Material tipo', 'Tipo']))]

In [36]:
cores_padrao = ['#e45756', '#4c78a8', '#f58518']
tipos = ['Holotype', 'Paratype', 'Neotype']

In [38]:
tipo = alt.Chart(temp, height=1000, width= 400, title='Types per Genus').mark_point(filled=False).encode(
    x = alt.X('start_year:O', title='Sampling Year'),
    y = alt.Y('genus:N', title= 'Genus',
              sort=alt.EncodingSortField('counts', op='count', order='descending')),
    color= alt.Color('family:N', title='Family',
                    scale= alt.Scale(domain=list(cores_familia_naive.keys()),
                                     range=list(cores_familia_naive.values())),
                    legend= alt.Legend(columns=3, symbolLimit=102, symbolType= 'circle')), 
    size= alt.Size('counts', scale= alt.Scale(range=[10,500]),
                   legend= alt.Legend(columns=5)),
    order= alt.Order('counts', sort='descending'),  # smaller points in front
    shape= alt.Shape('type_status:N', title='Type', legend=alt.Legend(columns=5)), 
#                     scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                      range=['triangle', 'square', 'circle'])),
#     opacity= alt.Opacity('type:N', scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                                      range=[1, 0.5, 1])),
    tooltip= [alt.Tooltip('type_status', title='type'),
              alt.Tooltip('start_year', title='start year'),
              alt.Tooltip('counts', title='counts')]
)

tipo = tipo.configure_title(fontSize=16).configure_axis(
    labelFontSize=12,
    titleFontSize=12
).configure_legend(
    labelFontSize=12,
    titleFontSize=12
)

# tipo.save('./graphs/tipo/tipos_por_genero.html')

# tipo

In [39]:
# genera ordered by the first year in which each one appears
genus_order = temp.groupby('genus')['start_year'].min().sort_values().index.tolist()
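
The ordering above ranks each genus by the earliest year in which it appears, and the resulting list is passed to the Y axis `sort`. A small sketch on hypothetical rows; the stable sort is an addition here so that ties keep alphabetical order:

```python
import pandas as pd

# Hypothetical toy rows, not the real database
toy = pd.DataFrame({
    "genus": ["Uca", "Callinectes", "Uca", "Macrobrachium"],
    "start_year": [1985, 1972, 1960, 1972],
})

# Earliest year per genus (index comes back sorted alphabetically)
first_year = toy.groupby("genus")["start_year"].min()

# Oldest first; kind="stable" keeps alphabetical order among ties
genus_order = first_year.sort_values(kind="stable").index.tolist()

print(genus_order)
```

Passing an explicit list like this to `alt.Y(..., sort=genus_order)` fixes the axis order regardless of the counts.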

In [45]:
tipo = alt.Chart(temp, height=1000, width= 400, title='Types per Genus').mark_point(filled=False).encode(
    x = alt.X('start_year:O', title='Sampling Year'),
    y = alt.Y('genus:N', title= 'Genus',
              sort=genus_order),
    color= alt.Color('family:N', title='Family',
                    scale= alt.Scale(domain=list(cores_familia_naive.keys()),
                                     range=list(cores_familia_naive.values())),
                    legend= alt.Legend(columns=3, symbolLimit=102, symbolType= 'circle')), 
    size= alt.Size('counts', scale= alt.Scale(range=[10,500]),
                   legend= alt.Legend(columns=5)),
    order= alt.Order('counts', sort='descending'),  # smaller points in front
    shape= alt.Shape('type_status:N', title='Type', legend=alt.Legend(columns=5)), 
#                     scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                      range=['triangle', 'square', 'circle'])),
#     opacity= alt.Opacity('type:N', scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                                      range=[1, 0.5, 1])),
    tooltip= [alt.Tooltip('type_status', title='type'),
              alt.Tooltip('start_year', title='start year'),
              alt.Tooltip('counts', title='counts')]
)

tipo = tipo.configure_title(fontSize=16).configure_axis(
    labelFontSize=12,
    titleFontSize=12
).configure_legend(
    labelFontSize=12,
    titleFontSize=12
)

# tipo.save('./graphs/tipo/tipos_por_genero-primeiro_ano.html')

# tipo

## Types per determiner

In [69]:
# subsetting (note: starts from NewTable, i.e. all orders, not only the Decapoda subset)
teste = NewTable[['min_depth','family','order', 'start_year', 'qualifier', 'catalog_number', 
                  'determiner_full_name', 'species', 'type_status']].copy()

# grouping by type, year, determiner and family
temp = teste.groupby(['type_status','start_year', 'determiner_full_name', 'family']).count()['order'].reset_index().rename(columns={
    'order':'counts'
})

# p.s.: Cótipo and Topótipo are not types; 'Material tipo' and 'Tipo' are dropped as well
temp = temp[~(temp['type_status'].isin(['Cótipo', 'Topótipo', 'Material tipo', 'Tipo']))]

In [70]:
# determiners ordered by the first year in which each one appears
determiner_order = temp.groupby('determiner_full_name')['start_year'].min().sort_values().index.tolist()

In [73]:
tipo = alt.Chart(temp, height=800, width= 500, title='Types per Determiner').mark_point(filled=False).encode(
    x = alt.X('start_year:O', title='Sampling Year'),
    y = alt.Y('determiner_full_name:N', title= 'Determiner',
              sort=determiner_order),
    color= alt.Color('family:N', title='Family',
                    scale= alt.Scale(domain=list(cores_familia_naive.keys()), 
                                     range=list(cores_familia_naive.values())),
                    legend= alt.Legend(columns=3, symbolLimit=102)), 
    size= alt.Size('counts:Q', scale=alt.Scale(range=[10,500]),
                   legend= alt.Legend(columns=5)),
    order= alt.Order('counts', sort='descending'),  # smaller points in front
    shape= alt.Shape('type_status:N', title='Type',
                    legend= alt.Legend(columns=5)), 
#                     scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                      range=['triangle', 'square', 'circle'])),
#     opacity= alt.Opacity('type_status:N', scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                                      range=[1, 0.5, 1])),
    tooltip= [alt.Tooltip('determiner_full_name', title='Determiner'),
              alt.Tooltip('type_status', title='type'),
              alt.Tooltip('start_year', title='start year'),
              alt.Tooltip('counts', title='counts')]
)

tipo = tipo.configure_title(fontSize=16).configure_axis(
    labelFontSize=12,
    titleFontSize=12
).configure_legend(
    labelFontSize=12,
    titleFontSize=12
)

# tipo.save('./graphs/tipo/tipos_por_determinador-primeiro_ano.html')

# tipo

## Types per family

In [74]:
# subsetting (note: starts from NewTable, i.e. all orders, not only the Decapoda subset)
teste = NewTable[['min_depth','family','order', 'start_year', 'qualifier', 'catalog_number', 
                  'genus', 'species', 'type_status']].copy()

# grouping by type, year, family and order
temp = teste.groupby(['type_status','start_year', 'family', 'order']).count()['genus'].reset_index().rename(columns={
    'genus':'counts'
})

# p.s.: Cótipo and Topótipo are not types; 'Material tipo' and 'Tipo' are dropped as well
temp = temp[~(temp['type_status'].isin(['Cótipo', 'Topótipo', 'Material tipo', 'Tipo']))]

In [75]:
# families ordered by the first year in which each one appears
family_order = temp.groupby('family')['start_year'].min().sort_values().index.tolist()

In [79]:
tipo = alt.Chart(temp, height=800, width= 500, title='Types per Family').mark_point(filled=False).encode(
    x = alt.X('start_year:O', title='Sampling Year'),
    y = alt.Y('family:N', title= 'Family',
              sort=family_order),
    color= alt.Color('family:N', title='Family',
                    scale= alt.Scale(domain=list(cores_familia_naive.keys()), 
                                     range=list(cores_familia_naive.values())),
                    legend= alt.Legend(columns=3, symbolLimit=102)), 
    size= alt.Size('counts:Q', scale=alt.Scale(range=[10,500]),
                   legend= alt.Legend(columns=5)),
    order= alt.Order('counts', sort='descending'),  # smaller points in front
    shape= alt.Shape('type_status:N', title='Type',
                    legend= alt.Legend(columns=5)), 
#                     scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                      range=['triangle', 'square', 'circle'])),
#     opacity= alt.Opacity('type_status:N', scale= alt.Scale(domain=['Holotype', 'Neotype','Paratype'],
#                                                      range=[1, 0.5, 1])),
    tooltip= [alt.Tooltip('family', title='Family'),
              alt.Tooltip('type_status', title='type'),
              alt.Tooltip('start_year', title='start year'),
              alt.Tooltip('counts', title='counts')]
)

tipo = tipo.configure_title(fontSize=16).configure_axis(
    labelFontSize=12,
    titleFontSize=12
).configure_legend(
    labelFontSize=12,
    titleFontSize=12
)

# tipo.save('./graphs/tipo/tipos_por_familia.html')

# tipo

<br>

**The end!**

-----