# NASA C-MAPSS Turbofan Engine Degradation Analysis

Analysis of the NASA dataset for predicting the remaining useful life (RUL) of turbofan engines.

## Data structure

- **unit_id**: Engine identifier
- **cycle**: Operating cycle number
- **op_setting_1, 2, 3**: Operational settings
- **sensor_1 to sensor_21**: Measurements from the 21 sensors
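
To make the roles of `unit_id` and `cycle` concrete, here is a minimal sketch of how an RUL label might be derived from them. It assumes each training engine is recorded until failure, so the last observed cycle corresponds to an RUL of 0. The toy frame and the `add_rul` helper are made up for illustration and are not part of the dataset or of this notebook.

```python
import pandas as pd

# Toy data (made up): engine 1 runs for 3 cycles, engine 2 for 2 cycles.
toy = pd.DataFrame({
    "unit_id": [1, 1, 1, 2, 2],
    "cycle":   [1, 2, 3, 1, 2],
})

def add_rul(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with an 'RUL' column: last cycle of the engine minus current cycle.

    Assumption: the last recorded cycle of each engine is its failure point.
    """
    out = df.copy()
    last_cycle = out.groupby("unit_id")["cycle"].transform("max")
    out["RUL"] = last_cycle - out["cycle"]
    return out

toy_with_rul = add_rul(toy)
print(toy_with_rul)
```

Using `groupby(...).transform("max")` keeps the result aligned row by row with the original frame, which avoids a separate merge step.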

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

sns.set_theme(style="whitegrid")
pd.set_option('display.max_columns', 30)

## 1. Loading the data

In [None]:
# Column names
columns = ['unit_id', 'cycle', 'op_setting_1', 'op_setting_2', 'op_setting_3'] + \
          [f'sensor_{i}' for i in range(1, 22)]

# Load the FD001 dataset (the simplest one to start with).
# Raw strings keep '\s+' from being parsed as an invalid escape sequence.
train_df = pd.read_csv('data/CMaps/train_FD001.txt', sep=r'\s+', header=None, names=columns)
test_df = pd.read_csv('data/CMaps/test_FD001.txt', sep=r'\s+', header=None, names=columns)
rul_df = pd.read_csv('data/CMaps/RUL_FD001.txt', sep=r'\s+', header=None, names=['RUL'])

print(f"Train shape: {train_df.shape}")
print(f"Test shape: {test_df.shape}")
print(f"RUL shape: {rul_df.shape}")
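
Beyond printing shapes, it can help to check that the RUL file lines up with the test set. The sketch below assumes the RUL file holds exactly one value per test engine. The `check_rul_alignment` helper and the toy frames are hypothetical and stand in for `test_df` and `rul_df`.

```python
import pandas as pd

def check_rul_alignment(test: pd.DataFrame, rul: pd.DataFrame) -> bool:
    """Return True if there is exactly one RUL row per distinct engine in the test set."""
    n_engines = test["unit_id"].nunique()
    n_labels = len(rul)
    return n_engines == n_labels

# Toy stand-ins (made up) for test_df and rul_df.
toy_test = pd.DataFrame({"unit_id": [1, 1, 2, 2, 2], "cycle": [1, 2, 1, 2, 3]})
toy_rul_ok = pd.DataFrame({"RUL": [112, 98]})
toy_rul_bad = pd.DataFrame({"RUL": [112, 98, 69]})

aligned = check_rul_alignment(toy_test, toy_rul_ok)
misaligned = check_rul_alignment(toy_test, toy_rul_bad)
print(aligned, misaligned)
```

With the real data, the same call would be `check_rul_alignment(test_df, rul_df)`.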

In [None]:
train_df.head(10)

In [None]:
train_df.describe()
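
One thing `describe()` can reveal is columns that never vary (zero standard deviation), which carry no information for a regression model. The sketch below shows one way to list such columns. The `constant_columns` helper and the toy values are made up for illustration, and whether any FD001 column is actually constant is not asserted here.

```python
import pandas as pd

def constant_columns(df: pd.DataFrame) -> list:
    """Return the names of columns that hold a single distinct value."""
    n_unique = df.nunique()
    return n_unique[n_unique <= 1].index.tolist()

# Toy frame (made up): 'sensor_a' varies, 'sensor_b' is constant.
toy = pd.DataFrame({
    "sensor_a": [641.8, 642.1, 642.3],
    "sensor_b": [14.62, 14.62, 14.62],
})

flat = constant_columns(toy)
print(flat)
```

On the real data this would be called as `constant_columns(train_df)`, and the result could be inspected before deciding which features to keep.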