
# **Data Science in Organizations**
Data Science helps organizations make better decisions by turning raw data into actionable insights. A data-driven culture depends on data mining, which extracts knowledge from data using statistical techniques and machine learning. That knowledge feeds several kinds of information systems: Transactional Information Systems (TIS) record daily transactions; Management Information Systems (MIS) produce reports; Decision Support Systems (DSS) simulate scenarios to support strategic decisions; and Executive Information Systems (EIS) give executives performance summaries. Techniques and tools are not enough on their own: organizations must also invest in infrastructure, technology, and trained professionals to use Data Science effectively, which leads to better decisions, optimized processes, and new opportunities.

In [1]:
# Importing all the necessary libraries and resources:
import pandas as pd
from sklearn.cluster import KMeans
import numpy as np

### **Transactional Information System (TIS)**

In [2]:
# Simulating daily transaction data recorded in an organization:
data = {
    'customer_id': [1, 2, 3, 4, 5, 6],
    'amount_spent': [120, 340, 560, 80, 220, 510],
    'visits': [3, 7, 10, 2, 4, 9]
}

transactions = pd.DataFrame(data)
print('=== Transaction Data (TIS) ===')
print(transactions)

=== Transaction Data (TIS) ===
   customer_id  amount_spent  visits
0            1           120       3
1            2           340       7
2            3           560      10
3            4            80       2
4            5           220       4
5            6           510       9


### **Data Mining (Statistical + Machine Learning)**

In [3]:
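# Using K-Means clustering to identify customer groups.
# Cluster ids (0/1) are arbitrary labels, not a ranking, and the features are unscaled, so amount_spent dominates the distance: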
kmeans = KMeans(n_clusters=2, random_state=42)
transactions['cluster'] = kmeans.fit_predict(transactions[['amount_spent', 'visits']])

print('\n=== Data Mining: Customer Segments ===')
print(transactions)


=== Data Mining: Customer Segments ===
   customer_id  amount_spent  visits  cluster
0            1           120       3        1
1            2           340       7        1
2            3           560      10        0
3            4            80       2        1
4            5           220       4        1
5            6           510       9        0
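
The clustering above runs on raw values, so `amount_spent` (hundreds) outweighs `visits` (single digits) in the distance calculation. The sketch below is one possible variation, not part of the original analysis: it standardizes both features with scikit-learn's `StandardScaler` before clustering and then names the segments from the data instead of relying on the numeric cluster id. The `segment` column and the labels `high_value` and `standard` are hypothetical names chosen for illustration, and scaling may assign borderline customers to a different group than the unscaled run.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Same simulated transactions as in the TIS cell
transactions = pd.DataFrame({
    'customer_id': [1, 2, 3, 4, 5, 6],
    'amount_spent': [120, 340, 560, 80, 220, 510],
    'visits': [3, 7, 10, 2, 4, 9],
})

# Put both features on a comparable scale (zero mean, unit variance)
features = transactions[['amount_spent', 'visits']]
scaled = StandardScaler().fit_transform(features)

# n_init is set explicitly so the behavior does not depend on the library default
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
transactions['cluster'] = kmeans.fit_predict(scaled)

# Cluster ids are arbitrary, so pick the high-value cluster by average spend
avg_spent = transactions.groupby('cluster')['amount_spent'].mean()
high_value_cluster = avg_spent.idxmax()

# Hypothetical readable labels for downstream reports
transactions['segment'] = (
    transactions['cluster']
    .eq(high_value_cluster)
    .map({True: 'high_value', False: 'standard'})
)

print(transactions)
```

Deriving the segment name from the cluster averages keeps later reports correct even if a different run or library version swaps the ids 0 and 1.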


### **Management Information System (MIS)**

In [4]:
# Generating simple operational reports:
report = transactions.groupby('cluster').agg({
    'amount_spent': 'mean',
    'visits': 'mean'
}).rename(columns={'amount_spent': 'avg_spent', 'visits': 'avg_visits'})

print('\n=== MIS Report: Average Metrics by Customer Group ===')
print(report)


=== MIS Report: Average Metrics by Customer Group ===
         avg_spent  avg_visits
cluster                       
0            535.0         9.5
1            190.0         4.0
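
A management report often needs group sizes and totals alongside averages. The following sketch extends the report with pandas named aggregation. The cluster assignments are copied from the output printed above so the example runs on its own; the column names `customers`, `total_spent`, and `revenue_share` are assumptions introduced for illustration.

```python
import pandas as pd

# Transactions with the cluster ids shown in the data mining output above
transactions = pd.DataFrame({
    'customer_id': [1, 2, 3, 4, 5, 6],
    'amount_spent': [120, 340, 560, 80, 220, 510],
    'visits': [3, 7, 10, 2, 4, 9],
    'cluster': [1, 1, 0, 1, 1, 0],
})

# Named aggregation: output_column=(input_column, aggregation)
report = transactions.groupby('cluster').agg(
    customers=('customer_id', 'nunique'),
    total_spent=('amount_spent', 'sum'),
    avg_spent=('amount_spent', 'mean'),
    avg_visits=('visits', 'mean'),
)

# Share of total spending contributed by each group
report['revenue_share'] = report['total_spent'] / report['total_spent'].sum()

print(report)
```

Adding the customer count next to the averages makes it clear how many customers each average is based on.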


### **Decision Support System (DSS)**

In [5]:
# Simulating a scenario: what if customer visits increase by 20%?
scenario = transactions.copy()
scenario['visits_new'] = scenario['visits'] * 1.20
# Assumes each customer's average spend per visit stays constant:
scenario['projected_revenue'] = scenario['visits_new'] * (scenario['amount_spent'] / scenario['visits'])

print('\n=== DSS Scenario: Projected Revenue with 20% More Visits ===')
print(scenario[['customer_id', 'projected_revenue']])


=== DSS Scenario: Projected Revenue with 20% More Visits ===
   customer_id  projected_revenue
0            1              144.0
1            2              408.0
2            3              672.0
3            4               96.0
4            5              264.0
5            6              612.0
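
A decision support tool usually compares several scenarios rather than one. The sketch below wraps the same projection in a function and evaluates a few growth rates. `project_revenue` and the chosen rates are hypothetical, and the model keeps the simplifying assumption that each customer's average spend per visit does not change.

```python
import pandas as pd

# Same simulated transactions as in the TIS cell
transactions = pd.DataFrame({
    'customer_id': [1, 2, 3, 4, 5, 6],
    'amount_spent': [120, 340, 560, 80, 220, 510],
    'visits': [3, 7, 10, 2, 4, 9],
})


def project_revenue(df, visit_growth):
    """Project total revenue for a relative change in visits.

    Assumes spend per visit is constant for every customer.
    """
    spend_per_visit = df['amount_spent'] / df['visits']
    projected_visits = df['visits'] * (1 + visit_growth)
    return float((spend_per_visit * projected_visits).sum())


# Compare a decline, the baseline, and two growth scenarios
scenarios = pd.DataFrame({'visit_growth': [-0.10, 0.0, 0.10, 0.20]})
scenarios['projected_revenue'] = scenarios['visit_growth'].apply(
    lambda growth: project_revenue(transactions, growth)
)

print(scenarios)
```

Under this assumption revenue scales linearly with visits, so the table mainly serves as a template; a more realistic model would let spend per visit vary with the scenario.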


### **Executive Information System (EIS)**

In [6]:
# High-level summary for executives:
total_revenue = transactions['amount_spent'].sum()
customer_count = transactions['customer_id'].nunique()
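# Cluster ids are arbitrary, so the high-value segment is the cluster with the highest average spend:
high_value_cluster = transactions.groupby('cluster')['amount_spent'].mean().idxmax()
high_value_count = (transactions['cluster'] == high_value_cluster).sum()

print('\n=== EIS Summary ===')
print(f'Total Revenue: ${total_revenue}')
print(f'Total Customers: {customer_count}')
print(f'High-Value Segment: {high_value_count} customers')


=== EIS Summary ===
Total Revenue: $1830
Total Customers: 6
High-Value Segment: 2 customers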
