### Task
In recent years, drones have become increasingly integral to various industries, from aerial photography and surveillance to package delivery and agriculture. As drones con- tinue to play a vital role in these sectors, optimizing their flight performance has become paramount. One critical aspect of drone operation is accurately predicting the flight du- ration for a given mission. Understanding how long a drone can remain airborne under specific conditions is essential for planning missions, optimizing resources, and ensur- ing operational efficiency.
In this scenario, the task is to develop a model that can predict the remaining flight time of a drone based on various parameters. This model will enable drone operators to estimate how long a drone can stay in the air before needing to land and recharge.
Your dataset is a comma-separated values file named training.csv that encompasses a wide range of attributes relevant to drone operation and performance evaluation. It includes parameters influencing a battery’s reliability and longevity such as battery volt- age (V), temperature (°C), capacity (Ah), health (%), and age (yrs). Additionally, the bat- tery charging pattern indicates how frequently the battery is charged. Environmental factors like wind speed (m/s2), humidity (%), and temperature (°C) are also included, alongside navigational parameters such as GPS signal strength (%), air traffic density, obstacle density, altitude (m), and drone speed (m/s2), which influence flight planning and safety. Additional data points cover flight modes (0: ’Hovering’, 1: ’Cruising’, 2: ’As- cending’, 3: ’Descending’), and payload type (0: ’Low’, 1: ’Medium’, 2: ’High’).

### Submission File Format
The submitted file must be in NumPy format with the .npy extension. The array must be of shape (2000,), where each element is a predicted remaining flight time for the testing dataset provided in test.csv. Indices must match, i.e., the first element of the solution array must correspond to the first row, and so on.

In [53]:
# import pandas as pd
# from sklearn.preprocessing import MinMaxScaler, OneHotEncoder, MaxAbsScaler
# from tensorflow.keras.layers import Input, Dense, concatenate
# from tensorflow.keras.models import Model

# # Load the data
# df = pd.read_csv("train.csv")

# # Extract features and target variable
# X = df.drop("time_remaining", axis=1)
# y = df["time_remaining"]

# # Define categorical and numerical columns
# categorical_cols = ["firmware_version", "drone_model_number", "payload_type", "country", "drone_color", "flight_mode", "manufacturer_name"]
# numerical_cols = [col for col in X.columns if col not in categorical_cols]

# # Preprocess date column
# X['manufacturing_date'] = pd.to_datetime(X['manufacturing_date'])
# reference_date = X['manufacturing_date'].min()
# X['manufacturing_date'] = (X['manufacturing_date'] - reference_date).dt.days

# # One-hot encode categorical features
# encoder = OneHotEncoder(handle_unknown='ignore')
# X_encoded = encoder.fit_transform(X[categorical_cols]).toarray()
# X_encoded = pd.DataFrame(X_encoded, columns=encoder.get_feature_names_out(categorical_cols))

# # Normalize numerical features
# scaler = MinMaxScaler()
# X_scaled = scaler.fit_transform(X[numerical_cols])
# X_scaled = pd.DataFrame(X_scaled, columns=numerical_cols)

# # Concatenate encoded categorical features and normalized numerical features
# X_processed = pd.concat([X_encoded, X_scaled], axis=1)

# # Define input layers for each feature
# inputs = []
# for i in range(X_processed.shape[1]):
#     input_layer = Input(shape=(1,), name=f"input_{i}")
#     inputs.append(input_layer)

# # Concatenate all input layers
# concatenated = concatenate(inputs)

# # Dense layers for processing concatenated inputs
# x = Dense(64, activation='relu')(concatenated)
# x = Dense(32, activation='relu')(x)

# # Output layer
# output = Dense(1)(x)

# # Create the model
# model = Model(inputs=inputs, outputs=output)

# # Compile the model
# model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# # Train the model
# model.fit([X_processed.iloc[:, i] for i in range(X_processed.shape[1])], y, epochs=10, batch_size=1, validation_split=0.2)

# # Make predictions
# predictions = model.predict([X_processed.iloc[:, i] for i in range(X_processed.shape[1])])


Epoch 1/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 615us/step - loss: 80241.9688 - mae: 135.4753 - val_loss: 1670.2935 - val_mae: 32.7425
Epoch 2/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 617us/step - loss: 1598.8933 - mae: 32.0364 - val_loss: 1249.9773 - val_mae: 28.3850
Epoch 3/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 595us/step - loss: 1158.3099 - mae: 27.2373 - val_loss: 987.6422 - val_mae: 24.8919
Epoch 4/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 602us/step - loss: 859.8702 - mae: 23.3086 - val_loss: 674.1179 - val_mae: 20.6842
Epoch 5/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 610us/step - loss: 598.3976 - mae: 19.4645 - val_loss: 414.2246 - val_mae: 16.3152
Epoch 6/100
[1m8000/8000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 594us/step - loss: 382.2837 - mae: 15.5262 - val_loss: 347.7948 - val_mae: 15.0631
Epoch 7/100
[1

KeyboardInterrupt: 

In [84]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder, MaxAbsScaler
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

# Load the data
df = pd.read_csv("train.csv")

# Extract features and target variable
X = df.drop("time_remaining", axis=1)
y = df["time_remaining"]

# Define categorical and numerical columns
categorical_cols = ["firmware_version", "drone_model_number", "payload_type", "country", "drone_color", "flight_mode", "manufacturer_name"]
numerical_cols = [col for col in X.columns if col not in categorical_cols]

# Preprocess date column
X['manufacturing_date'] = pd.to_datetime(X['manufacturing_date'])
reference_date = X['manufacturing_date'].min()
X['manufacturing_date'] = (X['manufacturing_date'] - reference_date).dt.days

# One-hot encode categorical features
encoder = OneHotEncoder(handle_unknown='ignore')
X_encoded = encoder.fit_transform(X[categorical_cols]).toarray()
X_encoded = pd.DataFrame(X_encoded, columns=encoder.get_feature_names_out(categorical_cols))

# Normalize numerical features
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X[numerical_cols])
X_scaled = pd.DataFrame(X_scaled, columns=numerical_cols)

# Concatenate encoded categorical features and normalized numerical features
X_processed = pd.concat([X_encoded, X_scaled], axis=1)

# Define input layers for each feature
inputs = []
for i in range(X_processed.shape[1]):
    input_layer = Input(shape=(1,), name=f"input_{i}")
    inputs.append(input_layer)

# Concatenate all input layers
concatenated = concatenate(inputs)

# Dense layers for processing concatenated inputs
x = Dense(64, activation='relu')(concatenated)
x = Dense(32, activation='relu')(concatenated)

# Output layer
output = Dense(1)(x)

# Create the model
model = RandomForestRegressor(n_estimators=100, random_state=42)

# Compile the model
# model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Train the model
# model.fit([X_processed.iloc[:, i] for i in range(X_processed.shape[1])], y, epochs=500, batch_size=10, validation_split=0.2)
# model.fit(X_processed, y, epochs=500, batch_size=10, validation_split=0.2)

model.fit(X_train, y_train)


# Make predictions
predictions = model.predict([X_processed.iloc[:, i] for i in range(X_processed.shape[1])])


ValueError: could not convert string to float: 'v5.0.0'

In [82]:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import LabelEncoder

# Step 1: Load and preprocess the training data
train_df = pd.read_csv("train.csv")

# Drop irrelevant columns
train_df.drop(columns=["drone_serial_number", "manufacturing_date"], inplace=True)

# Encode categorical variables
encoder = LabelEncoder()
train_df['battery_charging_pattern'] = encoder.fit_transform(train_df['battery_charging_pattern'])
train_df['flight_mode'] = encoder.fit_transform(train_df['flight_mode'])
train_df['payload_type'] = encoder.fit_transform(train_df['payload_type'])

# Separate features and target variable
X_train = train_df.drop(columns=["time_remaining"])
y_train = train_df["time_remaining"]

# Step 2: Train a Random Forest model
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# Step 3: Load and preprocess the testing data
test_df = pd.read_csv("test.csv")

# Drop irrelevant columns
test_df.drop(columns=["drone_serial_number", "manufacturing_date"], inplace=True)

# Encode categorical variables
test_df['battery_charging_pattern'] = encoder.transform(test_df['battery_charging_pattern'])
test_df['flight_mode'] = encoder.transform(test_df['flight_mode'])
test_df['payload_type'] = encoder.transform(test_df['payload_type'])

# Step 4: Predict the remaining flight time for the testing data
predictions = rf_model.predict(test_df)

# Step 5: Save predictions in NumPy format
np.save("predictions.npy", predictions)


ValueError: could not convert string to float: 'v5.0.0'

In [None]:

# Load the test data
test_df = pd.read_csv("test.csv")

# Preprocess the test data
X_test = test_df.copy()

# Preprocess date column
X_test['manufacturing_date'] = pd.to_datetime(X_test['manufacturing_date'])
X_test['manufacturing_date'] = (X_test['manufacturing_date'] - reference_date).dt.days

# One-hot encode categorical features
X_test_encoded = encoder.transform(X_test[categorical_cols]).toarray()
X_test_encoded = pd.DataFrame(X_test_encoded, columns=encoder.get_feature_names_out(categorical_cols))

# Normalize numerical features
X_test_scaled = scaler.transform(X_test[numerical_cols])
X_test_scaled = pd.DataFrame(X_test_scaled, columns=numerical_cols)

# Concatenate encoded categorical features and normalized numerical features
X_test_processed = pd.concat([X_test_encoded, X_test_scaled], axis=1)

# Make predictions
test_predictions = model.predict([X_test_processed.iloc[:, i] for i in range(X_test_processed.shape[1])])

# Flatten predictions array
test_predictions = test_predictions.flatten()

# Save predictions as .npy file
np.save("test_predictions.npy", test_predictions)


[1m63/63[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step
