<a href="https://colab.research.google.com/github/guilhermelaviola/ApplicationOfDataScienceForBusiness/blob/main/Class06.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Machine Learning for Business**
Machine Learning has become an essential business tool for optimizing processes, extracting insights from data, and gaining competitive advantage: systems learn from data, identify patterns, and make predictions. The choice of algorithm, whether supervised or unsupervised, depends on the problem to be solved, such as demand forecasting or fraud detection. Languages and tools like Python, Pandas, NumPy, and Scikit-learn support building, deploying, and evaluating models, while best practices such as splitting data into training and test sets help detect problems like overfitting. Model performance is measured with metrics appropriate to each type of task, and environments like Google Colab make experimentation and collaboration easier. Applied correctly, these concepts and tools let companies automate processes, support strategic decisions, and drive innovation in competitive markets.
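
As a minimal sketch of why the training/test split matters, the snippet below fits an unconstrained decision tree to noisy synthetic data and compares R-squared on the training and test sets. The data, the choice of `DecisionTreeRegressor`, and all variable names are assumptions made for illustration and are not part of the original notebook. A large gap between the two scores is the usual sign of overfitting.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

# Hypothetical data: a linear signal plus strong noise
rng = np.random.default_rng(0)
X_demo = np.arange(100, dtype=float).reshape(-1, 1)  # 100 unique inputs
y_demo = 2.0 * X_demo.ravel() + rng.normal(0, 30, 100)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# A tree with no depth limit can memorize every training point
tree = DecisionTreeRegressor(random_state=0)
tree.fit(X_tr, y_tr)

train_r2 = r2_score(y_tr, tree.predict(X_tr))
test_r2 = r2_score(y_te, tree.predict(X_te))

print(f'Train R-squared: {train_r2:.3f}')
print(f'Test R-squared:  {test_r2:.3f}')
```

Without a held-out test set, only the training score would be visible, and the memorization would go unnoticed.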

In [5]:
# Importing all the necessary libraries and resources:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

## **Example: Machine Learning Algorithms in Business**
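Linear regression fits a straight line relating an input variable to a target, so the quality of the fit depends on how strong the linear relationship in the data actually is. Let's create a fictitious dataset of monthly sales, using NumPy to generate random data and Pandas to organize it into a DataFrame. Note that these sales values are drawn independently of the month, so there is no real trend for the model to learn.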

In [7]:
# Generating fictional data:
np.random.seed(42)
months = np.arange(1, 25) # 24 months
sales = np.random.normal(200, 50, 24) # Monthly sales with mean of 200 and standard deviation of 50

# Generating a DataFrame with the fictional data above:
df = pd.DataFrame({
    'Months': months,
    'Sales': sales
})

In [8]:
# Dividing the data between input (X) and output (y) variables:
X = df.drop('Sales', axis=1) # Features: every column except the target 'Sales'
y = df['Sales']

# Dividing the data into training and test:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
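
To check what the split produced, one can inspect the sizes of the resulting sets. The sketch below rebuilds the same 24-row DataFrame so that it runs on its own. With `test_size=0.2`, scikit-learn rounds the test set up to 5 rows (24 × 0.2 = 4.8) and leaves the remaining 19 for training.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Rebuild the fictional dataset used in this notebook
np.random.seed(42)
df = pd.DataFrame({
    'Months': np.arange(1, 25),
    'Sales': np.random.normal(200, 50, 24)
})

X = df.drop('Sales', axis=1)
y = df['Sales']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Number of rows and columns in each feature set
print(X_train.shape, X_test.shape)  # (19, 1) (5, 1)
```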

In [9]:
# Creating the model:
model = LinearRegression()

# Training the model:
model.fit(X_train, y_train)

# Predicting on the test data:
y_pred = model.predict(X_test)

# Evaluating the model:
r2 = r2_score(y_test, y_pred)

# Printing the results:
print(f'R-squared: {r2}')
print('Predictions:', y_pred)

R-squared: 0.04754100037425035
Predictions: [206.83359373 185.53365471 228.13353275 180.20866995 198.8461166 ]
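
A low R-squared is unsurprising here, because the sales were generated independently of the month. As a hedged contrast, the sketch below builds a hypothetical dataset in which sales do grow over time (the slope of 10, the noise level, and the `*_trend` names are assumptions for illustration), fits the same `LinearRegression`, and computes R-squared both with `r2_score` and from its definition, $R^2 = 1 - SS_{res} / SS_{tot}$.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical data: sales rise by about 10 units per month, plus small noise
rng = np.random.default_rng(42)
months_trend = np.arange(1, 25)
sales_trend = 200 + 10 * months_trend + rng.normal(0, 5, 24)
df_trend = pd.DataFrame({'Months': months_trend, 'Sales': sales_trend})

X_tr, X_te, y_tr, y_te = train_test_split(
    df_trend[['Months']], df_trend['Sales'], test_size=0.2, random_state=42
)

model_trend = LinearRegression().fit(X_tr, y_tr)
y_hat = model_trend.predict(X_te)

# R-squared from its definition: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_te - y_hat) ** 2)
ss_tot = np.sum((y_te - y_te.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot
r2_sklearn = r2_score(y_te, y_hat)

print(f'Slope: {model_trend.coef_[0]:.2f}')
print(f'R-squared (r2_score): {r2_sklearn:.3f}')
print(f'R-squared (manual):   {r2_manual:.3f}')
```

With a genuine trend in the data, the fitted slope should land near the assumed value of 10, and R-squared should rise well above the value obtained on the random data.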
