Load the training data, test data, and sample submission file using pandas.

In [None]:
import pandas as pd
train_data = pd.read_csv('train.csv')
test_data = pd.read_csv('test.csv')
sample_submission = pd.read_csv('sample_submission.csv')

Check the shapes and structure of the training and test datasets.

In [None]:
train_data.info()
test_data.info()
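
`info()` prints dtypes and non-null counts, but a compact summary can be easier to compare across the two files. The following is a minimal sketch on a small made-up DataFrame standing in for `train.csv`; `summarize` is a hypothetical helper, not part of pandas, and the column names are illustrative only.

```python
import pandas as pd

def summarize(df: pd.DataFrame) -> dict:
    """Return the shape and per-column missing-value counts of a DataFrame."""
    return {
        "shape": df.shape,
        "missing": df.isna().sum().to_dict(),
    }

# Toy stand-in for train_data (assumed columns, for illustration only)
toy = pd.DataFrame({"Id": [1, 2, 3], "feature": [0.5, None, 1.5]})
summary = summarize(toy)
print(summary)
```

Calling the same helper on both datasets makes it easy to spot a column that is missing values in one file but not the other.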

Explore the data by visualizing the distribution of the target variable.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.countplot(data=train_data, x='target_variable')
plt.show()
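
A count plot shows class imbalance visually; the same information as numbers can be obtained with `value_counts`. A small sketch, assuming a toy series in place of `train_data['target_variable']`:

```python
import pandas as pd

# Toy stand-in for train_data['target_variable']
toy_target = pd.Series([0, 0, 0, 1], name="target_variable")

# Fraction of rows in each class, ordered by class label
proportions = toy_target.value_counts(normalize=True).sort_index()
print(proportions)
```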

Preprocess the data by handling missing values in train and test datasets.

In [None]:
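# Drop training rows that contain any missing value
train_data.dropna(inplace=True)
# Forward-fill gaps in the test set so that no test row is dropped
# (fillna(method='ffill') is deprecated in recent pandas; ffill() is the replacement)
test_data.ffill(inplace=True)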
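
Dropping rows and forward-filling are only one way to handle gaps. A common alternative, sketched here as an assumption rather than as this notebook's method, is to fill numeric columns with medians computed on the training data and reuse those same values for the test data. The toy frames and column names below are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the numeric feature columns of train_data and test_data
toy_train = pd.DataFrame({"a": [1.0, None, 3.0], "b": [10.0, 20.0, None]})
toy_test = pd.DataFrame({"a": [None, 4.0], "b": [5.0, None]})

# Medians come from the training rows only, then are reused for the test rows
medians = toy_train.median()
train_filled = toy_train.fillna(medians)
test_filled = toy_test.fillna(medians)

print(test_filled)
```

Computing the fill values on the training data alone keeps information from the test set out of preprocessing.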

Prepare the training data by separating features and target variable.

In [None]:
X = train_data.drop('target_variable', axis=1)
y = train_data['target_variable']

Split the data into training and validation sets.

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
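
If the target classes are imbalanced, a plain random split may leave the validation set with a different class ratio than the training set. The sketch below shows the `stratify` argument of `train_test_split` on made-up data; whether stratification is appropriate for this dataset is an assumption.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy imbalanced data: 80 rows of class 0, 20 rows of class 1
X_toy = pd.DataFrame({"feature": range(100)})
y_toy = pd.Series([0] * 80 + [1] * 20)

X_tr, X_va, y_tr, y_va = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# With stratification the validation set keeps the 80/20 class ratio
print(y_va.value_counts().to_dict())
```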

Initialize the model for training.

In [None]:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
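
A random forest built without a fixed seed can give slightly different results on each run. The following sketch, on synthetic data from `make_classification`, illustrates that passing `random_state` makes two fits reproducible; the seed value and `n_estimators=20` are arbitrary choices for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data standing in for X_train, y_train
X_toy, y_toy = make_classification(n_samples=200, n_features=5, random_state=0)

# Two forests with the same seed are trained independently
model_a = RandomForestClassifier(n_estimators=20, random_state=42).fit(X_toy, y_toy)
model_b = RandomForestClassifier(n_estimators=20, random_state=42).fit(X_toy, y_toy)

# Identical seeds and data should yield identical predicted probabilities
same = (model_a.predict_proba(X_toy) == model_b.predict_proba(X_toy)).all()
print(same)
```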

Fit the model using the training data.

In [None]:
model.fit(X_train, y_train)

Use the trained model to make predictions on the test data.

In [None]:
predictions = model.predict(test_data)
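
`predict` expects the test data to have the same feature columns as the training data. A defensive sketch follows; `align_to_training` is a hypothetical helper and the column names are made up.

```python
import pandas as pd

def align_to_training(test_df: pd.DataFrame, feature_columns) -> pd.DataFrame:
    """Select the training feature columns from test_df, in training order."""
    missing = [c for c in feature_columns if c not in test_df.columns]
    if missing:
        raise KeyError(f"Test data is missing columns: {missing}")
    return test_df[list(feature_columns)]

# Toy example: the test frame has an extra column and a different column order
toy_features = ["a", "b"]
toy_test = pd.DataFrame({"b": [1, 2], "extra": [9, 9], "a": [3, 4]})
aligned = align_to_training(toy_test, toy_features)
print(list(aligned.columns))
```

In the notebook this would correspond to something like `align_to_training(test_data, X_train.columns)` before calling `model.predict`.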

Evaluate the model's performance on the validation set using the accuracy metric.

In [None]:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_val, model.predict(X_val))
print(f'Accuracy: {accuracy}')
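
Accuracy alone can hide which classes the model confuses. A small sketch with hand-written labels (not results from this notebook) shows how a confusion matrix complements it:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-written toy labels, standing in for y_val and model.predict(X_val)
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)

print(acc)
print(cm)
```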

Create a submission file with predictions for the test data.

In [None]:
submission = pd.DataFrame({'Id': test_data['Id'], 'Predicted': predictions})
submission.to_csv('submission.csv', index=False)
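
The sample submission file loaded earlier can serve as a format reference before writing the CSV. The sketch below uses a hypothetical helper, `check_submission`, and toy frames; the column names `Id` and `Predicted` follow the cell above, but whether they match the real `sample_submission.csv` is an assumption.

```python
import pandas as pd

def check_submission(submission: pd.DataFrame, sample: pd.DataFrame) -> None:
    """Raise ValueError if the column names or row count differ from the sample."""
    if list(submission.columns) != list(sample.columns):
        raise ValueError(
            f"Columns {list(submission.columns)} do not match {list(sample.columns)}"
        )
    if len(submission) != len(sample):
        raise ValueError(
            f"Row count {len(submission)} does not match {len(sample)}"
        )

# Toy stand-ins for sample_submission and submission
toy_sample = pd.DataFrame({"Id": [1, 2], "Predicted": [0, 0]})
toy_submission = pd.DataFrame({"Id": [1, 2], "Predicted": [1, 0]})

check_submission(toy_submission, toy_sample)  # passes silently when the format matches
```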