**Question 1: Predicting Wine Quality**

Data Link: <https://archive.ics.uci.edu/ml/datasets/Wine+Quality>

You are given a dataset of wine samples, which includes the following features:

- fixed acidity

- volatile acidity

- citric acid

- residual sugar

- chlorides

- free sulfur dioxide

- total sulfur dioxide

- density

- pH

- sulphates

- alcohol

- quality (score between 0 and 10)

Your task is to build a machine learning model that predicts the quality of wine based on the other features.

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load data
data = pd.read_csv("winequality-red.csv", sep=";")

# Preprocess data: separate the features from the target column (quality)
X = data.drop(columns="quality")
y = data["quality"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate model
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("MSE:", mse)

MSE: 0.390025143963954
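
MSE is expressed in squared quality units, so its square root (RMSE) is easier to compare against the 0 to 10 quality scale. A minimal sketch, using only the MSE value printed above; the helper name `rmse_from_mse` is illustrative and not part of scikit-learn:

```python
import math


def rmse_from_mse(mse: float) -> float:
    """Return the root mean squared error for a given MSE."""
    return math.sqrt(mse)


reported_mse = 0.390025143963954  # value printed by the cell above
rmse = rmse_from_mse(reported_mse)
print(f"RMSE: {rmse:.3f}")
```

An RMSE of roughly 0.62 means the predictions are typically off by a bit more than half a quality point.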


**Question 2: Fraud Detection**

Data Link: <https://www.kaggle.com/mlg-ulb/creditcardfraud>

You are given a dataset of credit card transactions, which includes the following columns:

- Time (seconds elapsed since the first transaction in the dataset)

- V1 to V28 (anonymized numerical features obtained with PCA)

- Amount (transaction amount)

- Class (0 if not fraud, 1 if fraud)

Your task is to build a machine learning model that predicts whether a transaction is fraudulent or not, based on the other features.

In [2]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Load data
data = pd.read_csv("creditcard.csv")

# Preprocess data
X = data.drop("Class", axis=1)
y = data["Class"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)

Accuracy: 0.9986657771847899
Precision: 0.6222222222222222
Recall: 0.5714285714285714
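
Accuracy on its own says little here, because fraud is rare and a model that never flags anything would also score very high. The sketch below recomputes the printed metrics from confusion-matrix counts and adds the F1 score and a trivial baseline. The counts are an assumption: they are inferred from the three printed values and a test set of 56,962 rows, not read from the notebook.

```python
# Confusion-matrix counts (assumed: inferred from the printed metrics,
# not taken from the notebook itself)
tp, fp, fn, tn = 56, 34, 42, 56830

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of a trivial model that labels every transaction as non-fraud
baseline_accuracy = (tn + fp) / total

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
print(f"f1={f1:.4f} baseline_accuracy={baseline_accuracy:.4f}")
```

Under these assumed counts the all-negative baseline already reaches about 99.8% accuracy, so precision, recall, and F1 are the more informative numbers for this task.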


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
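
The warning above suggests scaling the data or raising `max_iter`. One possible way to do both is to wrap the model in a scikit-learn pipeline with `StandardScaler`. This is a sketch under stated assumptions: it uses synthetic data from `make_classification` as a stand-in for `creditcard.csv`, with one column inflated to mimic an unscaled amount, so its scores are not comparable to the results above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the real data (assumption)
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02], random_state=42
)
X[:, 0] *= 1000.0  # mimic a feature on a much larger scale than the others

# stratify=y keeps the class ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The scaler is fitted on the training data only, then applied to the test data
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Number of solver iterations actually used
n_iter = int(clf.named_steps["logisticregression"].n_iter_[0])
print("Iterations used:", n_iter)
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training, since `fit` only ever sees `X_train`.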


**Question 3: Image Classification**

Data Link: <https://www.kaggle.com/c/dogs-vs-cats/data>

You are given a dataset of images, each of which belongs to one of five categories: cat, dog, bird, fish, or horse. The dataset contains 10,000 images, evenly distributed across the categories. Note that the linked Dogs vs. Cats data covers only the cat and dog categories.

Your task is to build a machine learning model that can classify images into their respective categories.

In [3]:
# Requires TensorFlow (install with: pip install tensorflow)
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Preprocess data: rescale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

# Each directory must contain one subdirectory per class
train_dir = "train/"
test_dir = "test/"

# class_mode='categorical' yields one-hot labels, as needed for more than two classes
train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='categorical')

test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='categorical')

# Train model: three convolution/pooling blocks followed by a dense classifier
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    # One softmax output per class found in the training directory
    tf.keras.layers.Dense(train_generator.num_classes, activation='softmax')
])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model.fit(train_generator, epochs=10, validation_data=test_generator)

# Evaluate model
test_loss, test_acc = model.evaluate(test_generator)
print("Test accuracy:", test_acc)


ModuleNotFoundError: No module named 'tensorflow'
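
Since the cell above did not run, it can still help to check the network's shapes by hand. The sketch below traces the spatial size through the three convolution and pooling blocks, assuming the Keras defaults used in the model (3x3 kernels with `'valid'` padding and stride 1, 2x2 pooling with stride 2). The helper names `conv_out` and `pool_out` are illustrative, not Keras functions.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size: int, pool: int = 2) -> int:
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


size = 150  # input images are 150x150
for filters in (32, 64, 128):
    size = pool_out(conv_out(size))
    print(f"after block with {filters} filters: {size}x{size}x{filters}")

flat_features = size * size * 128
dense_params = flat_features * 512 + 512  # weights plus biases of Dense(512)
print("flattened features:", flat_features)
print("Dense(512) parameters:", dense_params)
```

The flattened vector has 36,992 features, so the `Dense(512)` layer alone holds about 18.9 million parameters, which is most of the model's weights.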