Lesson 10: Introduction to Basic Machine Learning with Python

## Introduction to Machine Learning
Machine learning (ML) is the practice of programming computers to learn from data. It is a branch of artificial intelligence (AI) built on the idea that systems can identify patterns in data and make decisions with minimal human intervention.

## Basic Machine Learning Application: Iris Classification
We will use a simple example to walk through the basic process of training a machine learning model. The example uses the Iris dataset, which describes iris flowers of three species. It contains 150 records (50 per species), each with five attributes: sepal length, sepal width, petal length, petal width, and species. We will train a model to predict the species from the four measurements.

Let's start by importing the necessary libraries.

In [1]:
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, accuracy_score

Let's load the iris dataset.

In [2]:
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
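
Before going further, it can help to confirm that the loaded data matches the description above. The following is a minimal sketch that rebuilds the same `df` and prints its shape, the species names behind the integer labels, and the number of records per class. The choice of checks is ours and is not part of the original lesson.

```python
import pandas as pd
from sklearn import datasets

# Rebuild the DataFrame exactly as in the cell above
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Rows and columns: four feature columns plus the target column
print(df.shape)

# The integer labels 0, 1, 2 index into this array of species names
print(list(iris.target_names))

# Number of records in each class
print(df['target'].value_counts().sort_index())
```

The `target` column holds integers rather than names, which is why the classification report later in this lesson lists the classes as 0, 1, and 2.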

Now we separate the features and target variables and then split our data into train and test sets.

In [3]:
# Separate features and target
X = df.iloc[:, :-1].values
Y = df.iloc[:, -1].values

# Split data into training and test sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)
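
To see what the split produced, the sketch below prints the sizes of the two sets and the number of test samples per class. It also shows an optional variant using the `stratify` argument of `train_test_split`, which keeps the class proportions the same in both sets. The stratified variant is an addition for illustration and is not what this lesson uses.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_iris(return_X_y=True)

# Same split as in the lesson: 80% training, 20% test
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=1)
print(X_train.shape, X_test.shape)

# Test samples per class; a plain random split need not be balanced
print(np.bincount(Y_test))

# Optional variant (not used in the lesson): preserve class proportions
_, _, _, Y_test_strat = train_test_split(
    X, Y, test_size=0.2, random_state=1, stratify=Y)
print(np.bincount(Y_test_strat))
```

With only 30 test samples, an unbalanced split means some classes are evaluated on very few flowers, so a stratified split may be worth considering on small datasets.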

We're going to use a Decision Tree Classifier for this task. Let's create an instance of it and fit our model on the training data.

In [4]:
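# Create an instance of DecisionTreeClassifier.
# No random_state is set here, so the fitted tree (and the scores below)
# may vary slightly between runs.
model = DecisionTreeClassifier()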

# Fit the model
model.fit(X_train, Y_train)
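
A fitted decision tree is a set of threshold rules on the features, and scikit-learn can print them. The sketch below is one way to inspect the model, using `export_text` and the `feature_importances_` attribute. Setting `random_state=0` on the classifier is our assumption, added only so the printed tree is repeatable; the lesson's model does not set it.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = datasets.load_iris()
X_train, X_test, Y_train, Y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=1)

# random_state=0 is an assumption for repeatability, not part of the lesson
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, Y_train)

# Print the learned decision rules as indented text
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)

# Relative contribution of each feature to the splits (values sum to 1)
for name, importance in zip(iris.feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

Reading the rules from top to bottom shows which measurement the tree tests first and how it narrows down to a species.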

After training the model, we make predictions on the test data.

In [5]:
# Predict the response for test dataset
Y_pred = model.predict(X_test)
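
The predictions are integer labels. As a small sketch, they can be turned back into species names by indexing `iris.target_names`, and the same `predict` call works on a new measurement. The variable `new_flower` and its values are illustrative, and `random_state=0` on the classifier is an assumption added for repeatability.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X_train, X_test, Y_train, Y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=1)

# random_state=0 is an assumption for repeatability, not part of the lesson
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Map integer labels to species names and compare the first few
predicted_species = iris.target_names[Y_pred]
actual_species = iris.target_names[Y_test]
for predicted, actual in list(zip(predicted_species, actual_species))[:5]:
    print(f"predicted={predicted:<12} actual={actual}")

# Illustrative measurements (cm): sepal length, sepal width,
# petal length, petal width
new_flower = [[5.1, 3.5, 1.4, 0.2]]
print(iris.target_names[model.predict(new_flower)])
```

Note that `predict` expects a 2D input, one row per flower, which is why the single measurement is wrapped in an outer list.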

Finally, we evaluate the model by computing its accuracy on the test set and printing a detailed classification report with per-class precision, recall, and F1-score.

In [6]:
# Accuracy: the fraction of test samples classified correctly
print('Accuracy:', accuracy_score(Y_test, Y_pred))
print('Classification Report:')
print(classification_report(Y_test, Y_pred))

Accuracy: 0.9666666666666667
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        11
           1       1.00      0.92      0.96        13
           2       0.86      1.00      0.92         6

    accuracy                           0.97        30
   macro avg       0.95      0.97      0.96        30
weighted avg       0.97      0.97      0.97        30
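
The report summarizes errors per class, but it does not show which classes get confused with each other. The sketch below adds two complementary checks that are not part of the original lesson: a confusion matrix for the same test split, and 5-fold cross-validation over the whole dataset. As before, `random_state=0` on the classifier is an assumption for repeatability, so the numbers may differ slightly from the output above.

```python
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, Y = datasets.load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=1)

# random_state=0 is an assumption for repeatability, not part of the lesson
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(Y_test, Y_pred)
print(cm)

# Accuracy on 5 different train/test partitions of the full dataset
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, Y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

In the report above, a recall of 0.92 for class 1 with a support of 13 corresponds to a single class 1 flower being misclassified, and the precision of 0.86 for class 2 indicates it was predicted as class 2. A confusion matrix makes that kind of error visible directly. Because the test set holds only 30 flowers, one error shifts accuracy by more than three percentage points, which is why averaging over several folds tends to give a steadier estimate.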

