# Machine Learning

Machine learning is a subfield of artificial intelligence that focuses on developing algorithms and statistical models that let computers learn from data and make predictions or decisions without being explicitly programmed for each task. It encompasses several techniques, including supervised learning, unsupervised learning, and reinforcement learning, which are applied in domains such as image recognition, natural language processing, and predictive analytics. Machine learning is increasingly built into everyday applications, so understanding its fundamental concepts is important for professionals in technology, data science, and related fields. In this notebook, we will explore key algorithms, implement them, and evaluate their performance using practical examples.

# Machine Learning Vs Deep Learning

Deep Learning is a subset of Machine Learning, not the other way around. Traditional Machine Learning relies on algorithms such as decision trees and support vector machines, often with hand-engineered features, while Deep Learning uses neural networks with many layers to model complex patterns and relationships in large datasets. Deep Learning usually requires more data and computational power, and it tends to perform best on tasks such as image and speech recognition when those resources are available. Traditional Machine Learning techniques are often more interpretable and easier to implement, especially for smaller datasets. The choice between them depends on the requirements of the task at hand, including data availability, the complexity of the problem, and the need for model interpretability. Understanding these differences helps practitioners select the appropriate approach for their projects.
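The interpretability trade-off can be sketched on a small dataset. The example below is an illustration under assumptions, not a benchmark: it uses the iris data, a depth-2 decision tree, and a small multi-layer network from scikit-learn with arbitrarily chosen layer sizes. Production deep learning models are much larger and are usually built with dedicated frameworks.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Traditional ML: a shallow decision tree whose learned rules can be printed and read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X_train, y_train)
tree_rules = export_text(tree, feature_names=list(iris.feature_names))
tree_accuracy = tree.score(X_test, y_test)

# Neural network: two hidden layers of 16 units each (sizes are an assumption
# made for illustration). Its weights are not directly readable as rules.
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
mlp_accuracy = mlp.score(X_test, y_test)

print(tree_rules)
print(f"Decision tree accuracy:  {tree_accuracy:.2f}")
print(f"Neural network accuracy: {mlp_accuracy:.2f}")
```

The printed tree is a handful of threshold rules a person can check by hand, whereas the network offers no comparable summary of how it reaches a prediction.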

# Machine Learning Vs Artificial Intelligence

The terms Machine Learning and Artificial Intelligence are often used interchangeably, but they represent different concepts in computer science. Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on algorithms that allow computers to learn from data and make predictions. AI is broader: it covers technologies and methodologies aimed at simulating human intelligence, including rule-based systems, natural language processing, and robotics. While ML relies on data-driven approaches that improve with more data, AI includes both data-driven and knowledge-based approaches to solving complex problems. In short, all machine learning is artificial intelligence, but not all artificial intelligence is machine learning. Understanding the distinction is important for grasping the capabilities and limitations of each. Organizations adopting these technologies should evaluate their specific needs and objectives to decide which approach fits.
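The contrast between knowledge-based and data-driven approaches can be sketched in a few lines. In this hypothetical example, the function `rule_based_is_setosa` and its 2.5 cm threshold are hand-written assumptions standing in for a rule-based system, while a one-split decision tree learns its own threshold from labeled iris data. Both are scored on the same data they were built from, so the numbers only illustrate the idea.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data
is_setosa = (iris.target == 0).astype(int)  # 1 = setosa, 0 = any other species


# Knowledge-based approach: a person writes the rule by hand.
# The 2.5 cm threshold is a hand-picked assumption for this illustration.
def rule_based_is_setosa(petal_length_cm):
    return int(petal_length_cm < 2.5)


# Column 2 of the iris feature matrix is petal length in cm.
rule_predictions = np.array([rule_based_is_setosa(row[2]) for row in X])
rule_accuracy = (rule_predictions == is_setosa).mean()

# Data-driven approach (machine learning): a one-split decision tree
# finds its own threshold from the labeled examples.
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(X, is_setosa)
learned_accuracy = stump.score(X, is_setosa)

print(f"Hand-written rule accuracy: {rule_accuracy:.2f}")
print(f"Learned rule accuracy:      {learned_accuracy:.2f}")
```

Both approaches count as AI in the broad sense, but only the second is machine learning, because its decision rule comes from data and not from a person.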

# Types of Machine Learning
1. Supervised Learning - Learning to predict outcomes from labeled input data. Accuracy and generalization can be improved with techniques such as feature engineering, model selection, and ensemble methods.
    - Classification: Predicts which of a set of predefined classes a data point belongs to, based on its features.
    - Regression: Predicts continuous outcomes from input data, modeling the relationship between the input features and the output with mathematical equations.
2. Unsupervised Learning - Learning patterns from unlabeled data to discover hidden structures without predefined labels.
    - Association
    - Dimensionality Reduction
    - Clustering
    - Anomaly Detection
    - Common application areas (domains where unsupervised methods are used, not methods themselves): recommendation systems, image processing, time series analysis, natural language processing, computer vision, speech recognition
3. Reinforcement Learning - Learning through trial and error, where an agent interacts with an environment and receives rewards or penalties, with the goal of maximizing cumulative reward. Unlike supervised learning, it does not rely on labeled examples.
    - Q-Learning
    - SARSA
    - Deep Q-Learning (DQN)
    - Deep SARSA
    - Deep Reinforcement Learning
    - Actor-Critic Methods
    - Proximal Policy Optimization (PPO)
    - Trust Region Policy Optimization (TRPO)
    - Soft Actor-Critic (SAC)
4. Semi-supervised Learning - Learning that combines a small amount of labeled data with a large amount of unlabeled data to improve learning accuracy. It leverages the strengths of both supervised and unsupervised learning to enhance model training and performance.
    - Self-training
    - Active Learning
    - Transfer Learning
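The difference between the first two categories can be sketched on the iris data. This is a minimal illustration under assumptions: logistic regression and k-means are stand-ins chosen for brevity, and the choice of three clusters relies on knowing in advance that the dataset has three species.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Supervised: the model sees features AND labels during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
supervised_accuracy = classifier.score(X_test, y_test)

# Unsupervised: the model sees ONLY the features and groups similar rows.
# The cluster ids are arbitrary and do not correspond to species codes.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
cluster_ids = kmeans.fit_predict(X)

print(f"Classifier accuracy on held-out data: {supervised_accuracy:.2f}")
print("First ten cluster assignments:", cluster_ids[:10])
```

Note that `y` is never passed to `KMeans`: the clustering has to discover structure from the measurements alone.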

# Features, Input and Labels

1. Features: Also called predictors, independent variables, or input variables. These are the values the model uses to predict the outcome. Usually denoted 'X' in machine learning.
2. Labels: Also called targets, dependent variables, or output variables. They are the values the model is trained to predict. Usually denoted 'y' in machine learning.
3. Training Data: The features and labels used to train a machine learning model. It is the dataset from which the model learns to make predictions. It is also called the training set.
4. Test Data: The features and labels held out from training and used to evaluate the model on data it has not seen. It is also called the test set. A validation set is a separate held-out split used for tuning the model, although the two terms are sometimes used loosely.
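These four terms map directly onto a few lines of code. The sketch below assumes the iris dataset bundled with scikit-learn and an 80/20 split; the split ratio and `random_state` are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data    # features: one row per flower, one column per measurement
y = iris.target  # labels: the species code the model should predict

# Training data is used to fit the model; test data is held back for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("All data:     ", X.shape, y.shape)
print("Training data:", X_train.shape, y_train.shape)
print("Test data:    ", X_test.shape, y_test.shape)
```

With 150 rows and `test_size=0.2`, 120 rows go to training and 30 to testing.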

In [None]:
# step 1 - Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder

### Why the Boston dataset was removed from sklearn.datasets
The load_boston function was deprecated in scikit-learn 1.0 and removed in version 1.2. Reasons include:
1. Ethical Concerns: The dataset includes a feature called B, which is derived from the proportion of Black residents in each town. It was engineered on the assumption that racial segregation affects house prices, so using it in predictive models can lead to unethical and discriminatory practices.

2. Outdated Data: The dataset is based on data collected in the 1970s, making it outdated and less relevant for modern applications.

3. Encouraging Better Practices: By removing the dataset, the maintainers of scikit-learn aim to encourage users to adopt more modern and ethically sound datasets for machine learning tasks.
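A regression exercise can still be run with a dataset that ships with scikit-learn. The sketch below uses `load_diabetes` purely as a stand-in, because it needs no download; that choice is an assumption of this example, and the scikit-learn documentation itself points to the California housing and Ames housing datasets as replacements.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Bundled regression dataset: 10 numeric features, continuous target.
X, y = load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit an ordinary least squares model on the training split.
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Evaluate on the held-out test split.
y_pred = regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean squared error: {mse:.1f}")
print(f"R^2 score:          {r2:.3f}")
```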

In [None]:
import pandas as pd
import seaborn as sns

df = sns.load_dataset("iris")
df.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


In [None]:
x = df.drop(columns="species")
y = df["species"]

In [None]:
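# Load the iris data directly from scikit-learn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


iris = load_iris()
x = iris.data    # features: sepal and petal measurements in cm
y = iris.target  # labels: integer species codes (0, 1, 2)

# Hold out 20% of the rows for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)
# Classify each point by a vote among its 3 nearest training points
model = KNeighborsClassifier(n_neighbors=3)
model.fit(x_train, y_train)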

In [38]:
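# Predict one new sample: [sepal length, sepal width, petal length, petal width] in cm
prediction = model.predict([[1.2, 2.3, 3.4, 2.1]])
# Map the predicted integer code back to the species name
species_name = iris.target_names[prediction[0]]
print(species_name)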

versicolor
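A single prediction does not show how reliable the model is. One way to check, sketched below under the same split and settings as the cells above, is to predict the held-out test set and compare the results with the true labels using `accuracy_score`. The block repeats the setup so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(x_train, y_train)

# Predict every row of the test set and compare with the true labels.
y_pred = model.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Test accuracy: {accuracy:.2f}")
```

Accuracy on 30 test rows is a coarse estimate; cross-validation would give a more stable figure.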
