# Introduction to Machine Learning

Sources:
- [AI For Everyone](https://www.coursera.org/learn/ai-for-everyone)
- [Artificial general intelligence (AGI)](https://www.techtarget.com/searchenterpriseai/definition/artificial-general-intelligence-AGI)
- [What is a Turing Test? A Brief History of the Turing Test and its Impact](https://www.youtube.com/watch?v=4VROUIAF2Do)
- [The history of deep learning](https://www.youtube.com/watch?v=mTtDfKgLm54)
- [History of AI](https://www.youtube.com/watch?v=EJt3_bFYKss)
- [Fairness-related harms in AI systems: Examples, assessment, and mitigation](https://www.youtube.com/watch?v=1RptHwfkx_k)
- [Responsible AI resources](https://www.microsoft.com/en-us/ai/responsible-ai-resources?activetab=pivot1:primaryr4&rtc=1)
- [Machine Learning for Everyone](https://vas3k.com/blog/machine_learning/)
- [Introduction to machine learning](https://learn.microsoft.com/sl-si/training/modules/introduction-to-machine-learning/)
- [AI Show - Deep Learning vs. Machine Learning](https://www.youtube.com/watch?v=lTd9RSxS9ZE)
- [Introduction to Machine Learning](https://www.youtube.com/watch?v=h0e2HAPTGF4)

## Theoretical Introduction to Machine Learning

PPT attached.

## Workflow of a machine learning project

**Data**

    

**Features and Target**


**Selecting your feature variable**


**Visualize your data**


**Split your dataset**


**Decide on a training method**


**Train a model**


**Evaluate the model**


**Parameter tuning**


**Prediction**


## What is machine learning?

Machine learning is about **extracting knowledge from data**. 




## What are machine learning models?

The model is the **core component of machine learning**, and ultimately **what we are trying to build**. 



## Why Machine Learning?

## Problems Machine Learning Can Solve

**Machine learning algorithms that learn from input/output pairs are called supervised
learning algorithms.** 

Examples of supervised machine learning tasks include:
- *Identifying the zip code from handwritten digits on an envelope*
- *Determining whether a tumor is benign based on a medical image*
- *Detecting fraudulent activity in credit card transactions*
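
To make "learning from input/output pairs" concrete, here is a minimal sketch. The toy data (hours studied, hours slept, and a pass/fail label) is entirely hypothetical, and the choice of `KNeighborsClassifier` is just one possible supervised algorithm, picked because it is the one used later in these notes.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: each row is one input (hours studied, hours slept),
# and each label is the known output for that row (0 = fail, 1 = pass).
X = np.array([[1.0, 4.0], [2.0, 5.0], [8.0, 7.0], [9.0, 8.0]])
y = np.array([0, 0, 1, 1])

# The algorithm learns from the input/output pairs ...
model = KNeighborsClassifier(n_neighbors=1)
model.fit(X, y)

# ... and then predicts the output for an input it has not seen before.
new_input = np.array([[7.5, 6.0]])
predicted = model.predict(new_input)
print(f"Predicted label: {predicted[0]}")
```

The key point is that `fit` receives both the inputs `X` and the known outputs `y`; that is what makes the task supervised.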


**Unsupervised algorithms** are the other main type of algorithm covered here. In unsupervised learning, **only the input data is known, and no known output
data is given to the algorithm**. 


Examples of unsupervised learning include:
- *Identifying topics in a set of blog posts*
- *Segmenting customers into groups with similar preferences*
- *Detecting abnormal access patterns to a website*
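
As a rough illustration of the customer segmentation example, the sketch below clusters a handful of made-up customers with `KMeans`. The data, the feature meanings, and the choice of two clusters are all assumptions for illustration; note that no output labels are passed to the algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer data: (visits per month, average basket value).
# There is no y here: the algorithm only ever sees the inputs.
X = np.array([
    [1, 10], [2, 12], [1, 11],       # occasional, low-spend customers
    [20, 95], [22, 100], [21, 98],   # frequent, high-spend customers
], dtype=float)

# Assumption: we ask for two groups; in practice this number must be chosen.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# The cluster numbers are arbitrary identifiers, not meaningful classes.
print(f"Cluster assignments: {labels}")
```

The algorithm groups similar rows together, but it is up to us to interpret what each group means.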

## scikit-learn

- scikit-learn is an **open source project**, meaning that it is free to use and distribute,
and anyone can easily obtain the source code to see what is going on behind the scenes.
- The scikit-learn project is constantly being developed and improved, and it
has a **very active user community**.
- It contains a number of **state-of-the-art machine learning algorithms**, as well as **comprehensive documentation** about each algorithm.
- scikit-learn is a very popular tool, and **one of the most prominent Python libraries for machine learning**. 
- It is widely used in industry and academia, and a wealth of tutorials and code snippets are available online. scikit-learn works well with a number of other scientific Python tools, such as NumPy, pandas, and matplotlib, which are used in the example below.

> scikit-learn is built on top of the NumPy and SciPy scientific Python libraries.

Website: https://scikit-learn.org/stable/

API reference: https://scikit-learn.org/stable/modules/classes.html

Examples: https://scikit-learn.org/stable/auto_examples/index.html

Scikit-learn makes it straightforward to build models and evaluate them for use. It is primarily focused on **using numeric data** and contains several **ready-made datasets for use as learning tools**.
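
A quick way to see what "ready-made datasets" and "numeric data" mean in practice is to load one of the bundled datasets and inspect it. The sketch below uses `load_wine`, chosen here only as an example of a bundled dataset other than Iris.

```python
import numpy as np
from sklearn.datasets import load_wine

# load_wine returns a dictionary-like object, just like load_iris.
wine_dataset = load_wine()

X = wine_dataset['data']      # feature matrix as a NumPy array
y = wine_dataset['target']    # integer class labels as a NumPy array

print(f"Data shape: {X.shape}")
print(f"Data dtype: {X.dtype}")
print(f"Number of features: {len(wine_dataset['feature_names'])}")
print(f"Classes: {np.unique(y)}")
```

Every feature is stored as a number in a single two-dimensional array, with one row per sample and one column per feature. This is the layout that scikit-learn estimators expect.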

## A First Application: Classifying Iris Species

### The data

In [None]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris

iris_dataset = load_iris()

In [None]:
iris_dataset.keys()

In [None]:
print(iris_dataset['DESCR'][:293])

In [None]:
iris_dataset['target_names']

In [None]:
iris_dataset['feature_names']

In [None]:
type(iris_dataset['data'])

In [None]:
iris_dataset['data'].shape

In [None]:
iris_dataset['data'][:5]

In [None]:
type(iris_dataset['target'])

In [None]:
iris_dataset['target'].shape

In [None]:
iris_dataset['target']

The meanings of the numbers are given by the `iris_dataset['target_names']` array:
- 0 means setosa
- 1 means versicolor
- 2 means virginica
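
The mapping above can be applied to the whole target array at once by indexing `target_names` with the integer codes. The following is a small sketch; the variable names `species` and `counts` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_dataset = load_iris()
target = iris_dataset['target']
target_names = iris_dataset['target_names']

# Indexing the names array with the integer codes gives one name per sample.
species = target_names[target]
print(f"First three samples: {species[:3]}")

# Count how many samples belong to each class.
counts = np.bincount(target)
for name, count in zip(target_names, counts):
    print(f"{name}: {count} samples")
```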

### Training and Testing Data

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(iris_dataset['data'], iris_dataset['target'], random_state=0)

In [None]:
print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")

In [None]:
print(f"X_test shape: {X_test.shape}")
print(f"y_test shape: {y_test.shape}")
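
By default, `train_test_split` shuffles the data and puts 75% of the rows in the training set and 25% in the test set. The sketch below shows the default next to a variant using the `test_size` and `stratify` parameters. The hold-out size of 30 samples is an arbitrary choice for illustration, and the `_s` variable names are made up here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Default behaviour: shuffle, then hold out 25% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
print(f"Default split: {X_train.shape[0]} train, {X_test.shape[0]} test")

# Illustrative variant: an integer test_size is an absolute number of samples,
# and stratify=y keeps the class proportions the same in both parts.
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=30, stratify=y, random_state=0)
print(f"Stratified test set class counts: {np.bincount(y_test_s)}")
```

Fixing `random_state` makes the shuffle reproducible, so the same rows end up in the test set on every run.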

### Look at Your Data

In [None]:
# create dataframe from data in X_train
# label the columns using the strings in iris_dataset.feature_names
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
iris_dataframe.head()

In [None]:
from matplotlib.colors import ListedColormap
from pandas.plotting import scatter_matrix

# create a scatter matrix from the dataframe, color by y_train
grr = scatter_matrix(iris_dataframe, 
                     c=y_train, 
                     figsize=(15, 15), 
                     marker='o', 
                     hist_kwds={'bins': 20}, 
                     s=60, 
                     alpha=.8, 
                     cmap=ListedColormap(['#0000aa', '#ff2020', '#50ff50']))
plt.show()

### Building Your First Model: k-Nearest Neighbors

In [None]:
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=1)

In [None]:
knn.fit(X_train, y_train)
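
With `n_neighbors=1`, the classifier labels a new point with the label of the single closest training point. To show the idea without the library, here is a minimal NumPy sketch. The helper `predict_1nn` is hypothetical (it is not part of scikit-learn) and it assumes plain Euclidean distance.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def predict_1nn(X_train, y_train, X_query):
    """Hypothetical helper: give each query point the label of its closest training point."""
    # Differences between every query point and every training point,
    # shape (n_query, n_train, n_features).
    diffs = X_query[:, np.newaxis, :] - X_train[np.newaxis, :, :]
    # Euclidean distances, shape (n_query, n_train).
    distances = np.sqrt((diffs ** 2).sum(axis=2))
    # Index of the closest training point for each query point.
    nearest = distances.argmin(axis=1)
    return y_train[nearest]


manual_pred = predict_1nn(X_train, y_train, X_test)
sklearn_pred = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train).predict(X_test)

agreement = np.mean(manual_pred == sklearn_pred)
print(f"Agreement with scikit-learn: {agreement:.2f}")
```

"Training" this model amounts to storing the training set; all the work happens at prediction time. The two implementations may resolve exact distance ties differently, so the comparison is reported as an agreement rate.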

### Making Predictions

In [None]:
X_new = np.array([[5, 2.9, 1, 0.2]])
print(f"X_new.shape: {X_new.shape}")

In [None]:
prediction = knn.predict(X_new)

print(f"Prediction: {prediction}")
print(f"Predicted target name: {iris_dataset['target_names'][prediction]}")
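
To see why the model made this prediction, we can ask it which training point was closest to `X_new`. The sketch below uses the `kneighbors` method of the fitted classifier; it repeats the setup so that it can be run on its own.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)
knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

X_new = np.array([[5, 2.9, 1, 0.2]])

# kneighbors returns the distances to, and the row indices of, the nearest
# training points. Both arrays have shape (n_queries, n_neighbors).
distances, indices = knn.kneighbors(X_new)
neighbor_index = indices[0, 0]
neighbor_label = y_train[neighbor_index]

print(f"Nearest training point: {X_train[neighbor_index]}")
print(f"Distance to it: {distances[0, 0]:.3f}")
print(f"Its species: {iris_dataset['target_names'][neighbor_label]}")
```

The species of that single nearest neighbor is what `knn.predict(X_new)` returns.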

### Evaluating the Model

In [None]:
y_pred = knn.predict(X_test)

print(f"Test set predictions:\n {y_pred}")

In [None]:
print(f"Test set score: {np.mean(y_pred == y_test):.2f}")

In [None]:
knn.score(X_test, y_test)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(iris_dataset['data'], iris_dataset['target'], random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

print(f"Test set score: {knn.score(X_test, y_test):.2f}")
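
The workflow at the top of these notes lists parameter tuning as a step after evaluation. One possible way to do it for k-nearest neighbors is sketched below: a few values of `n_neighbors` are compared with cross-validation on the training set only. The candidate values and the use of 5 folds are arbitrary choices for illustration, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# Assumption: an arbitrary handful of candidate values to compare.
candidate_k = [1, 3, 5, 7, 9]
mean_scores = {}
for k in candidate_k:
    knn = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation on the training set; the test set stays untouched.
    scores = cross_val_score(knn, X_train, y_train, cv=5)
    mean_scores[k] = scores.mean()
    print(f"n_neighbors={k}: mean CV accuracy {mean_scores[k]:.3f}")

# Refit on the full training set with the best value, then evaluate once.
best_k = max(mean_scores, key=mean_scores.get)
final_model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
test_score = final_model.score(X_test, y_test)
print(f"Best n_neighbors: {best_k}")
print(f"Test set score: {test_score:.2f}")
```

Choosing the parameter with the training data and touching the test set only once at the end keeps the final score an honest estimate of performance on new data.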