![title](CBMpy.png)

**INSTITUTO NACIONAL DE PESQUISAS ESPACIAIS** 

Course: Introduction to Data Science

Instructors: Rafael Santos and Gilberto Queiroz

Student: Marcelly Homem Coelho

Contact: marcellyhc@gmail.com

**Title:** Applying Data Science Techniques to the Development of a Condition-Based Aircraft Maintenance System

**Description:** This program analyzes the failure messages and the component removals recorded for aircraft systems.

In [1]:
# Import the libraries

import numpy as np
import pandas as pd
import seaborn as sns

import random

import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)

import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# generate_color returns a random color as a '#rrggbb' hex string

def generate_color():
    color = '#{:02x}{:02x}{:02x}'.format(*map(lambda x: random.randint(0, 255), range(3)))
    return color
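
Because `generate_color` draws from the global random state, the chart colors change on every run. The sketch below shows one way to make them repeatable. The helper name `generate_color_seeded`, its `rng` parameter and the seed value 42 are illustrative assumptions, not part of the original notebook.

```python
import random

def generate_color_seeded(rng):
    """Return a '#rrggbb' color drawn from the given random.Random instance."""
    return '#{:02x}{:02x}{:02x}'.format(*(rng.randint(0, 255) for _ in range(3)))

# Two generators created with the same (arbitrary) seed yield the same colors
rng = random.Random(42)
colors_first = [generate_color_seeded(rng) for _ in range(3)]

rng = random.Random(42)
colors_second = [generate_color_seeded(rng) for _ in range(3)]

print(colors_first == colors_second)
```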

# 1. Initial Investigation of the Structure and Content of the Failure File

In [3]:
# Create a dataframe from the failure data (.csv file)

df_Failure = pd.read_csv('CBMpy_dataFailureCode.csv')   

In [4]:
# Display the first rows of the dataframe

df_Failure.head()

Unnamed: 0,Aircraft,Flight Phase,Date,Fault Text,Maintenance Message
0,2640,,2006-05-14 16:19:00,FDE_Outhers02,
1,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_B_System3,MMSG_A_System3
2,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_C_System3,MMSG_A_System3
3,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_B_System3,MMSG_A_System3
4,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_C_System3,MMSG_A_System3


$\color{blue}{\text{NOTE:}}$ 

Fig. 1 highlights the Engine Indication and Crew Alerting System (EICAS), the aircraft system responsible for displaying engine parameters and alerting the crew about system configuration or failures.
EICAS has three display modes:
    - Operational mode: presents operational information and alerts that require crew action in flight.
    - Status mode: displays subsystem parameters and equipment/component status messages. In df_Failure it is represented by the 'Fault Text' column.
    - Maintenance mode: provides maintenance personnel with information to support fault detection and subsystem verification tests. In df_Failure it is represented by the 'Maintenance Message' column.

![title](FDE_CAS.png)

Fig. 1: Crew Alerting System.

$\color{blue}{\text{NOTE:}}$ 

Aircraft have a Maintenance Control Panel, shown in Fig. 2, which maintenance personnel use to display maintenance data stored in memory. In df_Failure it is represented by the 'Maintenance Message' column.

![title](PainelControleManutenção.png)

Fig. 2: Maintenance Control Panel.

In [5]:
# Check the dimensions of the dataframe (number of rows, number of columns)

df_Failure.shape

(7238, 5)

In [6]:
# Check the data type of each dataframe column

df_Failure.dtypes

Aircraft                int64
Flight Phase           object
Date                   object
Fault Text             object
Maintenance Message    object
dtype: object

In [7]:
# Convert the 'Date' column to datetime

df_Failure['Date'] =  pd.to_datetime(df_Failure['Date'], format='%Y/%m/%d %H:%M')
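
An explicit `format` only parses strings that match it exactly, so it can help to count the values that fail. The following is a minimal sketch on made-up strings written in the `'%Y/%m/%d %H:%M'` layout used above. The raw layout of the real CSV is not shown in this notebook, so the sample values are assumptions.

```python
import pandas as pd

# Made-up raw values: two match the format, one does not
raw = pd.Series(['2006/05/14 16:19', '2006/07/01 15:17', 'not a date'])

# errors='coerce' turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format='%Y/%m/%d %H:%M', errors='coerce')

# Number of entries that could not be parsed
n_bad = int(parsed.isna().sum())
print(n_bad)
```

A non-zero count on the real column would indicate rows that need a different format or manual cleaning.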

In [8]:
# Check the data type of each dataframe column

df_Failure.dtypes

Aircraft                        int64
Flight Phase                   object
Date                   datetime64[ns]
Fault Text                     object
Maintenance Message            object
dtype: object

In [9]:
# Count how many distinct 'Aircraft' there are in the dataframe

len(df_Failure['Aircraft'].unique())

15
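
`len(series.unique())` counts a missing value as one more category, whereas `nunique()` ignores it by default. That makes no difference for a column without gaps, but it does for one such as 'Flight Phase', which contains NaN in the first row shown earlier. A small sketch on made-up values (the phase name 'Climb' is an assumption used only for illustration):

```python
import pandas as pd

# Made-up sample with one missing flight phase
phases = pd.Series(['Enroute Cruise', None, 'Enroute Cruise', 'Climb'])

n_with_missing = len(phases.unique())   # the missing value counts as a category
n_without_missing = phases.nunique()    # missing values are ignored

print(n_with_missing, n_without_missing)
```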

In [10]:
df_Failure['Aircraft'].value_counts()

2766    1059
1950     728
1990     668
1151     626
2640     560
326      475
131      465
2436     421
1419     419
791      417
2838     369
2209     324
312      268
1710     243
2982     196
Name: Aircraft, dtype: int64

$\color{blue}{\text{NOTE:}}$ Aircraft 2766 has the largest number of Fault Text records. It also ranks second in the number of part removals.

## 1.1 Grouping the Fault Text Data for All Aircraft

This dataframe is used later to detect concentrations of each Fault Text across all Aircraft.

In [11]:
# Display the first rows of the dataframe

df_Failure.head()

Unnamed: 0,Aircraft,Flight Phase,Date,Fault Text,Maintenance Message
0,2640,,2006-05-14 16:19:00,FDE_Outhers02,
1,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_B_System3,MMSG_A_System3
2,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_C_System3,MMSG_A_System3
3,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_B_System3,MMSG_A_System3
4,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_C_System3,MMSG_A_System3


In [12]:
# Identify every 'Fault Text' present across all Aircraft
array_FDE_All = np.array(df_Failure['Fault Text'].unique())

# Drop the NaN items from the array
array_FDE_All = array_FDE_All[~pd.isnull(array_FDE_All)]  

# Display the array
array_FDE_All

array(['FDE_Outhers02', 'FDE_B_System3', 'FDE_C_System3', 'FDE_C_System4',
       'FDE_B_System4', 'FDE_G_System3', 'FDE_E_System4', 'FDE_A_System4',
       'FDE_F_System4', 'FDE_G_System4', 'FDE_F_System3', 'FDE_E_System1',
       'FDE_A_System1', 'FDE_A_System3', 'FDE_D_System3', 'FDE_Outhers10',
       'FDE_Outhers11', 'FDE_Outhers12', 'FDE_G_System2', 'FDE_Outhers03',
       'FDE_Outhers00', 'FDE_Outhers04', 'FDE_E_System3', 'FDE_Outhers01',
       'FDE_G_System1', 'FDE_F_System1', 'FDE_Outhers09', 'FDE_Outhers05',
       'FDE_D_System4', 'FDE_H_System4', 'FDE_H_System1', 'FDE_C_System2',
       'FDE_B_System2', 'FDE_D_System1', 'FDE_D_System2', 'FDE_Outhers06',
       'FDE_K_System1', 'FDE_K_System4', 'FDE_K_System2', 'FDE_Q_System3',
       'FDE_C_System1', 'FDE_A_System2', 'FDE_E_System2', 'FDE_H_System3',
       'FDE_M_System3', 'FDE_F_System2', 'FDE_Outhers18', 'FDE_B_System1',
       'FDE_Outhers17', 'FDE_J_System3', 'FDE_Outhers16',
       'WINDOW HEAT 1L', 'WINDOW HEAT 1R',

In [13]:
arrayAux = []

df_FDE_All = pd.DataFrame(columns= ['Aircraft','year', 'month', 'day'])

for aux in array_FDE_All:
    
    # Build a dataframe with the rows of the current Fault Text
    dfMsg = pd.DataFrame(df_Failure[df_Failure['Fault Text'] == aux])   
    
    # Count the occurrences of this Fault Text per aircraft and per day
    arrayAux = dfMsg.groupby([dfMsg['Aircraft'],
                              dfMsg['Date'].dt.year.rename('year'),
                              dfMsg['Date'].dt.month.rename('month'),
                              dfMsg['Date'].dt.day.rename('day')]).count()['Fault Text']
    
    # Convert the groupby result to a dataframe so that it can be merged
    arrayAux = arrayAux.to_frame().reset_index()
    
    arrayAux.columns = ['Aircraft', 'year', 'month', 'day', aux]
    
    # Merge the dataframes with how='outer', which adds the new column
    # and keeps every (Aircraft, year, month, day) key; see the documentation at
    # https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html  
    df_FDE_All = pd.merge(df_FDE_All, arrayAux, how='outer', on=['Aircraft', 'year','month','day'])

In [14]:
# Display the first rows of the dataframe

df_FDE_All.head()

Unnamed: 0,Aircraft,year,month,day,FDE_Outhers02,FDE_B_System3,FDE_C_System3,FDE_C_System4,FDE_B_System4,FDE_G_System3,...,FDE_M_System2,FDE_Outhers19,FDE_J_System4,FDE_Outhers14,FDE_P_System4,FDE_O_System2,FDE_O_System3,FDE_J_System2,FDE_L_System1,FDE_J_System1
0,131,2006,8,15,6.0,,,,,,...,,,,,,,,,,
1,131,2006,8,20,1.0,,,,,,...,,,,,,,,,,
2,131,2006,8,24,1.0,,,,,,...,,,,,,,,,,
3,131,2006,9,7,1.0,,,,,,...,,,,,,,,,,
4,131,2007,10,6,2.0,,,,,,...,,,,,,,,,,
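
The loop of outer merges above produces one count column per Fault Text. As a hedged alternative, the same kind of table can be built in a single step with `pd.crosstab`. The sketch below runs on a small made-up sample and is not the notebook's method. It also differs in detail: it keeps one `Date` column instead of year/month/day, and absent combinations are filled with integer zeros instead of NaN.

```python
import pandas as pd

# Made-up sample in the same layout as df_Failure
toy = pd.DataFrame({
    'Aircraft': [2640, 2640, 2640, 2766],
    'Date': pd.to_datetime(['2006-07-01 15:17', '2006-07-01 15:17',
                            '2006-07-02 04:48', '2006-07-01 10:00']),
    'Fault Text': ['FDE_B_System3', 'FDE_C_System3',
                   'FDE_B_System3', 'FDE_B_System3'],
})

# Rows: (Aircraft, calendar day); columns: one per Fault Text; values: counts
counts = pd.crosstab([toy['Aircraft'], toy['Date'].dt.normalize()],
                     toy['Fault Text']).reset_index()

print(counts)
```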


In [15]:
# Add a 'Date' column to the dataframe (built from the year, month and day fields)

df_FDE_All['Date'] = pd.to_datetime(df_FDE_All.year*10000 + df_FDE_All.month*100 + df_FDE_All.day, format='%Y%m%d') 
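
`pd.to_datetime` can also assemble dates directly from a dataframe whose columns are named `year`, `month` and `day`, which avoids the integer arithmetic. A minimal sketch on made-up values:

```python
import pandas as pd

# Made-up date components in the same column layout as df_FDE_All
parts = pd.DataFrame({'year': [2006, 2006], 'month': [8, 9], 'day': [15, 7]})

# The columns must be named year, month and day for the assembly to work
dates = pd.to_datetime(parts[['year', 'month', 'day']])

print(dates)
```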

In [16]:
# Sort the dataframe by year, then month, then day

df_FDE_All = df_FDE_All.sort_values(['year', 'month', 'day'])

In [17]:
# Replace NaN elements with zeros

df_FDE_All = df_FDE_All.fillna(0) 

In [18]:
# Display the first rows of the dataframe

df_FDE_All.head()

Unnamed: 0,Aircraft,year,month,day,FDE_Outhers02,FDE_B_System3,FDE_C_System3,FDE_C_System4,FDE_B_System4,FDE_G_System3,...,FDE_Outhers19,FDE_J_System4,FDE_Outhers14,FDE_P_System4,FDE_O_System2,FDE_O_System3,FDE_J_System2,FDE_L_System1,FDE_J_System1,Date
795,1151,2006,3,25,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-25
68,1950,2006,4,19,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-19
310,1990,2006,4,19,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-19
798,2766,2006,4,19,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-19
69,1950,2006,4,20,0.0,1.0,2.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-20


## 1.2 Analysis of the Failure Messages of a Specific Aircraft

In [19]:
# Define a variable that selects a specific aircraft

var_aircraftSelected = 2766

In [20]:
# Create a dataframe for the selected aircraft

df_Failure_airSelec = df_FDE_All[df_FDE_All['Aircraft'] == var_aircraftSelected]

In [21]:
# Display the first rows of the dataframe

df_Failure_airSelec.head()

Unnamed: 0,Aircraft,year,month,day,FDE_Outhers02,FDE_B_System3,FDE_C_System3,FDE_C_System4,FDE_B_System4,FDE_G_System3,...,FDE_Outhers19,FDE_J_System4,FDE_Outhers14,FDE_P_System4,FDE_O_System2,FDE_O_System3,FDE_J_System2,FDE_L_System1,FDE_J_System1,Date
798,2766,2006,4,19,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-19
799,2766,2006,4,21,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-21
539,2766,2006,4,22,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-22
540,2766,2006,4,23,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-23
982,2766,2006,4,25,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-25


In [22]:
# Check the dimensions of the dataframe (number of rows, number of columns)

df_Failure_airSelec.shape

(151, 82)

### 1.2.1 Time Series Chart of the Fault Text of the Selected Aircraft

In [23]:
array_data = []

for aux in array_FDE_All:
    
    trace = go.Bar(x = df_Failure_airSelec['Date'],
                   y = df_Failure_airSelec[aux],
                   name = aux,
                   marker = {'color': generate_color().upper()}) 
    
    # Append the trace to array_data
    array_data.append(trace)

# The layout and the figure do not depend on the loop variable, so they are built once
layout = go.Layout(title='Fault Text Graphic',
                   xaxis=dict(tickfont=dict(size=14, color='rgb(107, 107, 107)')),
                   yaxis=dict(title='Quantity',
                              titlefont=dict(size=16, color='rgb(107, 107, 107)'),
                              tickfont=dict(size=14, color='rgb(107, 107, 107)')), 
                   legend=dict(x=-0.5, y=-1.0,
                               bgcolor='rgba(255, 255, 255, 0)',
                               bordercolor='rgba(255, 255, 255, 0)'),
                   barmode='group',
                   bargap=0.15,
                   bargroupgap=0.1)

fig = dict(data=array_data, layout=layout) 

py.iplot(fig, filename='style-bar')
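
The plotting loop adds one trace per Fault Text, including those the selected aircraft never reported; after `fillna(0)` those columns contain only zeros. One possible refinement, sketched here on made-up data with assumed names (`daily`, `fault_cols`, `active_cols`), is to keep only the columns with at least one non-zero count before building the traces.

```python
import pandas as pd

# Made-up daily counts for one aircraft
daily = pd.DataFrame({
    'Date': pd.to_datetime(['2006-04-19', '2006-04-21']),
    'FDE_B_System3': [0.0, 2.0],
    'FDE_C_System3': [0.0, 0.0],
})
fault_cols = ['FDE_B_System3', 'FDE_C_System3']

# Keep only the fault texts that occur at least once for this aircraft
active_cols = [col for col in fault_cols if (daily[col] != 0).any()]

print(active_cols)
```

Iterating over `active_cols` instead of the full list would shorten the legend without changing the bars that are drawn.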

# 2. Initial Investigation of the Structure and Content of the Removal File

In [24]:
# Create a dataframe from the removal data (.csv file)

df_Removal = pd.read_csv('CBMpy_dataRemovalCode.csv')  

In [25]:
# Display the first rows of the dataframe

df_Removal.head()

Unnamed: 0,Aircraft,Component,System,Date,Reason,Time Hours,Time Cycles
0,1140,REM_Component_A,System1,2006-05-29,3,118123,15961
1,1140,REM_Component_A,System1,2006-05-29,3,118123,15961
2,1140,REM_Component_B,System1,2006-05-29,3,1092,139
3,1140,REM_Component_B,System3,2006-06-24,3,312,37
4,1140,REM_Component_B,System3,2006-07-10,3,118698,16028


In [26]:
# Check the dimensions of the dataframe (number of rows, number of columns)

df_Removal.shape

(1282, 7)

In [27]:
# Check the data type of each dataframe column

df_Removal.dtypes

Aircraft         int64
Component       object
System          object
Date            object
Reason           int64
Time Hours       int64
Time Cycles      int64
dtype: object

In [28]:
# Convert the 'Date' column to datetime (date only)

df_Removal['Date'] =  pd.to_datetime(df_Removal['Date'], format='%Y/%m/%d')

In [29]:
# Check the data type of each dataframe column

df_Removal.dtypes

Aircraft                 int64
Component               object
System                  object
Date            datetime64[ns]
Reason                   int64
Time Hours               int64
Time Cycles              int64
dtype: object

In [30]:
# Count how many distinct 'Component' values there are in the dataframe

len(df_Removal['Component'].unique())

17

In [31]:
# Check which 'Component' values were replaced most often

df_Removal['Component'].value_counts()

REM_Component_B    268
REM_Component_A    210
REM_Component_D    177
REM_Component_F    113
REM_Component_J     87
REM_Component_G     86
REM_Component_H     77
REM_Component_I     64
REM_Component_N     45
REM_Component_E     38
REM_Component_L     37
REM_Component_K     25
REM_Component_O     22
REM_Component_M     19
REM_Component_C      6
REM_Component_P      5
REM_Component_Q      3
Name: Component, dtype: int64

In [32]:
# Check which 'Aircraft' had the most component replacements

df_Removal['Aircraft'].value_counts()

2640    99
2766    94
2361    92
2326    91
2567    86
1950    78
2982    74
1399    62
2838    60
2436    59
131     55
1151    54
312     53
1990    50
1419    49
736     46
1710    41
2209    38
791     37
326     30
1140    25
165      9
Name: Aircraft, dtype: int64

$\color{blue}{\text{NOTE:}}$ Aircraft 2766 has the second-largest number of part removals. It also ranks first in the number of Fault Text records.
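
The comparison in this note can be made explicit by placing the two counts side by side and ranking them. The sketch below uses three of the aircraft and the counts printed in the `value_counts()` outputs of this notebook; restricting it to three aircraft and the column names `faults`, `removals`, `fault_rank` and `removal_rank` are choices made for illustration.

```python
import pandas as pd

# Counts copied from the value_counts() outputs for three aircraft
faults = pd.Series({2766: 1059, 1950: 728, 2640: 560}, name='faults')
removals = pd.Series({2640: 99, 2766: 94, 1950: 78}, name='removals')

# Align both series on the aircraft identifier
summary = pd.concat([faults, removals], axis=1)

# Rank 1 means the largest count
summary['fault_rank'] = summary['faults'].rank(ascending=False).astype(int)
summary['removal_rank'] = summary['removals'].rank(ascending=False).astype(int)

print(summary)
```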

## 2.1 Grouping the Removal Data for All Aircraft

In [33]:
# Display the first rows of the dataframe

df_Removal.head()

Unnamed: 0,Aircraft,Component,System,Date,Reason,Time Hours,Time Cycles
0,1140,REM_Component_A,System1,2006-05-29,3,118123,15961
1,1140,REM_Component_A,System1,2006-05-29,3,118123,15961
2,1140,REM_Component_B,System1,2006-05-29,3,1092,139
3,1140,REM_Component_B,System3,2006-06-24,3,312,37
4,1140,REM_Component_B,System3,2006-07-10,3,118698,16028


In [34]:
# Identify every 'Component' present across all Aircraft
array_Removal_All = np.array(df_Removal['Component'].unique())

# Drop the NaN items from the array
array_Removal_All = array_Removal_All[~pd.isnull(array_Removal_All)]  

# Display the array
array_Removal_All

array(['REM_Component_A', 'REM_Component_B', 'REM_Component_C',
       'REM_Component_D', 'REM_Component_E', 'REM_Component_F',
       'REM_Component_G', 'REM_Component_H', 'REM_Component_I',
       'REM_Component_J', 'REM_Component_K', 'REM_Component_L',
       'REM_Component_M', 'REM_Component_N', 'REM_Component_O',
       'REM_Component_P', 'REM_Component_Q'], dtype=object)

In [35]:
# Merge the per-component counts into a single dataframe (grouped by date)

arrayY = []

df_Removal_All = pd.DataFrame(columns= ['Aircraft','year', 'month', 'day'])

for aux in array_Removal_All:
    
    # Build a dataframe with the rows of the current Component
    dfMsg = pd.DataFrame(df_Removal[df_Removal['Component'] == aux])
    
    # Count the removals of this Component per aircraft and per day
    arrayY = dfMsg.groupby([dfMsg['Aircraft'],
                            dfMsg['Date'].dt.year.rename('year'),
                            dfMsg['Date'].dt.month.rename('month'),
                            dfMsg['Date'].dt.day.rename('day')]).count()['Component']
    
    # Convert the groupby result to a dataframe so that it can be merged
    arrayY = arrayY.to_frame().reset_index()
    
    arrayY.columns = ['Aircraft', 'year', 'month', 'day', aux]
       
    # Use how='outer', which adds the new column and keeps every (Aircraft, year, month, day) key
    df_Removal_All = pd.merge(df_Removal_All, arrayY, how='outer', on=['Aircraft', 'year','month','day'])

In [36]:
# Display the first rows of the dataframe

df_Removal_All.head()

Unnamed: 0,Aircraft,year,month,day,REM_Component_A,REM_Component_B,REM_Component_C,REM_Component_D,REM_Component_E,REM_Component_F,...,REM_Component_H,REM_Component_I,REM_Component_J,REM_Component_K,REM_Component_L,REM_Component_M,REM_Component_N,REM_Component_O,REM_Component_P,REM_Component_Q
0,131,2006,4,9,1.0,,,,,,...,,,,,,,,,,
1,131,2007,9,15,1.0,,,,,,...,,,,,,,,,,
2,131,2007,9,20,1.0,,,,,,...,,,,,,,,,,
3,131,2007,10,6,1.0,1.0,,,,,...,,,,,1.0,,,,,
4,131,2009,1,6,1.0,,,1.0,1.0,,...,,,1.0,,,,,,,


In [37]:
# Add a 'Date' column to the dataframe (built from the year, month and day fields)

df_Removal_All['Date'] = pd.to_datetime(df_Removal_All.year*10000 + df_Removal_All.month*100 + df_Removal_All.day, format='%Y%m%d') 

In [38]:
# Sort the dataframe by year, then month, then day

df_Removal_All = df_Removal_All.sort_values(['year', 'month', 'day'])

In [39]:
# Replace NaN elements with zeros

df_Removal_All = df_Removal_All.fillna(0) 

In [40]:
# Display the first rows of the dataframe

df_Removal_All.head()

Unnamed: 0,Aircraft,year,month,day,REM_Component_A,REM_Component_B,REM_Component_C,REM_Component_D,REM_Component_E,REM_Component_F,...,REM_Component_I,REM_Component_J,REM_Component_K,REM_Component_L,REM_Component_M,REM_Component_N,REM_Component_O,REM_Component_P,REM_Component_Q,Date
158,131,2006,3,19,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-19
455,2766,2006,3,19,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-19
271,2209,2006,3,23,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-23
311,2766,2006,3,23,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-23
126,2640,2006,3,28,2.0,0.0,0.0,1.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-28


## 2.2 Analysis of the Removals of a Specific Aircraft

In [41]:
# Variable that selects a specific aircraft (already defined in section 1.2)

#var_aircraftSelected = 2766

var_aircraftSelected

2766

In [42]:
# Create a dataframe for the selected aircraft

df_Removal_airSelec = df_Removal_All[df_Removal_All['Aircraft'] == var_aircraftSelected]

In [43]:
# Display the first rows of the dataframe

df_Removal_airSelec.head()

Unnamed: 0,Aircraft,year,month,day,REM_Component_A,REM_Component_B,REM_Component_C,REM_Component_D,REM_Component_E,REM_Component_F,...,REM_Component_I,REM_Component_J,REM_Component_K,REM_Component_L,REM_Component_M,REM_Component_N,REM_Component_O,REM_Component_P,REM_Component_Q,Date
455,2766,2006,3,19,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-19
311,2766,2006,3,23,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-03-23
134,2766,2006,4,19,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-19
312,2766,2006,4,23,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-04-23
313,2766,2006,5,25,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2006-05-25


In [44]:
# Check the dimensions of the dataframe (number of rows, number of columns)

df_Removal_airSelec.shape

(72, 22)

### 2.2.1 Time Series Chart of the Removals of the Selected Aircraft

In [45]:
array_data = []

for aux in array_Removal_All:
    
    trace = go.Bar(x = df_Removal_airSelec['Date'],
                   y = df_Removal_airSelec[aux],
                   name = aux,
                   marker = {'color': generate_color().upper()}) 
    
    # Append the trace to array_data
    array_data.append(trace)

# The layout and the figure do not depend on the loop variable, so they are built once
layout = go.Layout(title='Removal Graphic',
                   xaxis=dict(tickfont=dict(size=14, color='rgb(107, 107, 107)')),
                   yaxis=dict(title='Quantity',
                              titlefont=dict(size=16, color='rgb(107, 107, 107)'),
                              tickfont=dict(size=14, color='rgb(107, 107, 107)')), 
                   legend=dict(x=-0.5, y=-1.0,
                               bgcolor='rgba(255, 255, 255, 0)',
                               bordercolor='rgba(255, 255, 255, 0)'),
                   barmode='group',
                   bargap=0.15,
                   bargroupgap=0.1)

fig = dict(data=array_data, layout=layout) 

py.iplot(fig, filename='style-bar')

Image of the interactive chart:
![title](plot_removal.png)

# 4. Failure Detection

A failure is characterized by a frequent concentration of FDEs.
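
One possible reading of "frequent concentration" is a high number of occurrences of the same FDE within a short time window. The sketch below illustrates that reading on made-up daily counts. The window length (`WINDOW_DAYS`), the threshold (`THRESHOLD`) and the sample values are hypothetical and are not taken from the notebook.

```python
import pandas as pd

# Made-up daily counts of one FDE for one aircraft
daily = pd.Series(
    [1, 0, 0, 3, 2, 4, 0, 0],
    index=pd.date_range('2006-04-19', periods=8, freq='D'),
    name='FDE_B_System3',
)

WINDOW_DAYS = 3   # hypothetical window length
THRESHOLD = 5     # hypothetical minimum number of occurrences in the window

# Time-based window: each value sums the current day and the two days before it
rolling_total = daily.rolling(f'{WINDOW_DAYS}D').sum()

# Days on which the windowed total reaches the threshold
flagged = rolling_total[rolling_total >= THRESHOLD]

print(flagged)
```

A time-based window is used instead of a fixed number of rows because the aggregated dataframes only contain the days on which at least one event was recorded.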

In [46]:
# Display the first rows of the dataframe

df_Failure.head()

Unnamed: 0,Aircraft,Flight Phase,Date,Fault Text,Maintenance Message
0,2640,,2006-05-14 16:19:00,FDE_Outhers02,
1,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_B_System3,MMSG_A_System3
2,2640,Enroute Cruise,2006-07-01 15:17:00,FDE_C_System3,MMSG_A_System3
3,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_B_System3,MMSG_A_System3
4,2640,Enroute Cruise,2006-07-02 04:48:00,FDE_C_System3,MMSG_A_System3
