<a href="https://colab.research.google.com/github/guilhermelaviola/IntegrativePracticeInDataScience/blob/main/Class14.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Disruptive Solutions**
Disruptive solutions use new technologies or reconfigured processes to transform markets: they improve efficiency, quality, and user experience while lowering costs. Innovation management is the strategic use of resources to generate that value. Three disciplines drive much of this innovation: Data Science, Business Intelligence, and Machine Learning. Data Science focuses on analyzing large volumes of data, and Big Data frameworks such as Hadoop and Spark distribute that processing across clusters. Machine Learning automates learning from historical data, as in chatbots that improve customer interactions.

Adoption brings challenges, including implementation costs and resistance to change, so a structured rollout is crucial for success. Machine Learning and IoT in particular are expected to keep shaping future developments.

In [1]:
# Importing all the necessary libraries:
import random
import time

import pandas as pd
from pyspark.sql import SparkSession
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

## **Analyzing Large Data (Pandas + Simple Aggregation)**

In [2]:
# Building a small sample dataset of customer interactions (a stand-in for a large one):
data = {
    'customer_id': range(1, 11),
    'interaction_time': [12, 5, 3, 9, 15, 7, 8, 10, 4, 6],
    'satisfaction_score': [4, 5, 3, 4, 5, 2, 4, 5, 3, 4]
}
df = pd.DataFrame(data)

# Basic Data Science analysis:
avg_time = df['interaction_time'].mean()
avg_satisfaction = df['satisfaction_score'].mean()

print('Average Handling Time:', avg_time)
print('Average Satisfaction Score:', avg_satisfaction)

Average Handling Time: 7.9
Average Satisfaction Score: 3.9
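
The overall averages above can be broken down by group. The following is a minimal sketch, reusing the same ten rows, that computes the mean handling time and the number of customers for each satisfaction score. The variable name `by_score` is introduced here for illustration only.

```python
import pandas as pd

# Same sample data as the cell above.
df = pd.DataFrame({
    'customer_id': range(1, 11),
    'interaction_time': [12, 5, 3, 9, 15, 7, 8, 10, 4, 6],
    'satisfaction_score': [4, 5, 3, 4, 5, 2, 4, 5, 3, 4],
})

# Mean handling time and row count for each satisfaction score.
by_score = (
    df.groupby('satisfaction_score')['interaction_time']
      .agg(['mean', 'count'])
)
print(by_score)
```

With only ten rows, the per-group means rest on one to four customers each, so they illustrate the technique and should not be read as a trend.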


## **Big Data Processing concept**

In [3]:
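# Example of how Spark would process a much larger dataset distributed across a cluster.
# Assumes customer_logs.csv, with an event_type column, exists in the working directory;
# without it, spark.read.csv raises the PATH_NOT_FOUND error shown in the output.
spark = SparkSession.builder.appName('BigDataExample').getOrCreate()

# Loading the large dataset:
df = spark.read.csv('customer_logs.csv', header=True, inferSchema=True)

# Counting events per type (Spark computes this in parallel across partitions):
result = df.groupBy('event_type').count()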

result.show()

AnalysisException: [PATH_NOT_FOUND] Path does not exist: file:/content/customer_logs.csv.
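
The cell fails because `customer_logs.csv` is not shipped with the notebook. As a local stand-in, the sketch below uses only the Python standard library (not Spark) to show what `groupBy('event_type').count()` computes. The rows, the `customer_id` column, and the event names are hypothetical; the real file's schema is unknown.

```python
import csv
import os
import tempfile
from collections import Counter

# Hypothetical log rows; the real customer_logs.csv is not part of this notebook.
rows = [
    {'customer_id': '1', 'event_type': 'login'},
    {'customer_id': '2', 'event_type': 'purchase'},
    {'customer_id': '1', 'event_type': 'login'},
    {'customer_id': '3', 'event_type': 'support_ticket'},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the sample rows to a temporary CSV file with a header line.
    path = os.path.join(tmp_dir, 'customer_logs.csv')
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['customer_id', 'event_type'])
        writer.writeheader()
        writer.writerows(rows)

    # Single-machine equivalent of df.groupBy('event_type').count().
    with open(path, newline='') as f:
        event_counts = Counter(row['event_type'] for row in csv.DictReader(f))

for event_type, count in sorted(event_counts.items()):
    print(event_type, count)
```

Spark performs the same group-and-count, but splits the file into partitions and merges partial counts from each worker, which is what makes it suitable for data that does not fit on one machine.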

## **Machine Learning example**

In [4]:
# Training a Model to Predict Satisfaction:
# Example dataset:
df = pd.DataFrame({
    'interaction_time': [12, 5, 3, 9, 15, 7, 8, 10, 4, 6],
    'issues_resolved': [2, 1, 1, 2, 3, 1, 2, 3, 1, 2],
    'satisfaction_score': [4, 5, 3, 4, 5, 2, 4, 5, 3, 4]
})

X = df[['interaction_time', 'issues_resolved']]
y = df['satisfaction_score']

# No random_state is set, so the split (and the output below) changes on every run:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

model = LinearRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)

print('Predictions:', predictions)
print('MSE:', mean_squared_error(y_test, predictions))

Predictions: [4.02862986 4.2392638  3.92842536]
MSE: 1.4327878633272135
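
One way to put the MSE above in context is to compare it with a constant baseline that always predicts the mean satisfaction score. The sketch below uses only the ten scores from the example dataset; the helper `mean_squared_error_manual` is introduced here for illustration. Note the assumption: the baseline is evaluated on all ten rows, not on the random three-row test set, so the comparison is only rough.

```python
import statistics

# Satisfaction scores from the example dataset above.
satisfaction_scores = [4, 5, 3, 4, 5, 2, 4, 5, 3, 4]

def mean_squared_error_manual(y_true, y_pred):
    # Average of the squared differences between true and predicted values.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Baseline: predict the overall mean score for every customer.
baseline_prediction = statistics.mean(satisfaction_scores)
baseline_mse = mean_squared_error_manual(
    satisfaction_scores,
    [baseline_prediction] * len(satisfaction_scores),
)

print('Baseline prediction:', baseline_prediction)
print('Baseline MSE:', round(baseline_mse, 2))
```

On this data the baseline MSE is about 0.89, so a test MSE of 1.43 on three held-out rows does not show the regression beating a constant guess. With so few samples, either number is noisy.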


## **Chatbot Example with Machine Learning-Driven Interaction**

In [7]:
# Simple chatbot based on keyword matching. It is not a real ML model; it only illustrates the interaction pattern.
def simple_chatbot(user_input):
    user_input = user_input.lower()

    if 'problem' in user_input:
        return 'I am sorry to hear that. Can you describe the issue?'
    elif 'help' in user_input:
        return 'Sure! How can I support you today?'
    elif 'thanks' in user_input:
        return 'You are welcome!'
    else:
        return 'I am here to assist you with your request.'

# Example interaction:
print(simple_chatbot('I need help with my account'))

Sure! How can I support you today?
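
The if/elif chain above can also be written as an ordered rule table, which makes new keywords easier to add. This is a sketch of one possible variant, not a replacement for an ML model; `RULES`, `DEFAULT_REPLY`, and `rule_based_chatbot` are hypothetical names. Unlike the substring test in `simple_chatbot`, it matches whole words.

```python
import re

# Ordered (keyword, reply) rules; the first matching keyword wins,
# mirroring the order of the if/elif chain in simple_chatbot.
RULES = [
    ('problem', 'I am sorry to hear that. Can you describe the issue?'),
    ('help', 'Sure! How can I support you today?'),
    ('thanks', 'You are welcome!'),
]
DEFAULT_REPLY = 'I am here to assist you with your request.'

def rule_based_chatbot(user_input):
    # Split into lowercase words so 'help' does not match 'helpful'.
    words = set(re.findall(r"[a-z']+", user_input.lower()))
    for keyword, reply in RULES:
        if keyword in words:
            return reply
    return DEFAULT_REPLY

print(rule_based_chatbot('I need help with my account'))
print(rule_based_chatbot('That was not helpful'))
```

Whole-word matching is a trade-off: it avoids false hits such as 'helpful', but it also misses variants such as 'problems' that the substring version would catch.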


## **IoT + Analytics example**

In [8]:
# Simulating an IoT temperature sensor and printing five readings, one per second:
def read_sensor():
    # Simulated IoT temperature sensor:
    return 20 + random.random() * 5

for _ in range(5):
    print('Temperature reading:', read_sensor())
    time.sleep(1)

Temperature reading: 24.733400262748535
Temperature reading: 23.769699737841414
Temperature reading: 20.926567252512978
Temperature reading: 22.948347180002404
Temperature reading: 20.340772716586116
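
The loop above only prints readings. A minimal sketch of the analytics step could collect them, compute a summary, and flag high values. The fixed seed, the `ALERT_THRESHOLD` value, and the `rng` parameter are assumptions added for illustration, and the one-second delay is omitted so the example runs instantly.

```python
import random
import statistics

def read_sensor(rng):
    # Simulated IoT temperature sensor, same 20 to 25 range as above.
    return 20 + rng.random() * 5

rng = random.Random(42)  # fixed seed (assumption) so runs are repeatable
readings = [read_sensor(rng) for _ in range(5)]

ALERT_THRESHOLD = 24.0  # hypothetical alert limit, in the sensor's units
mean_temp = statistics.mean(readings)
alerts = [r for r in readings if r > ALERT_THRESHOLD]

print('Mean temperature:', round(mean_temp, 2))
print('Readings above threshold:', len(alerts))
```

In a real deployment the readings would arrive from a device or message queue instead of a random generator, and the threshold would come from the application's requirements.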
