### Assignment: Data Analysis with Pre-Generated Random Data

This assignment guides you through analyzing a pre-generated dataset using Pandas. You'll practice manipulating and analyzing data to extract insights.

#### Setup: Generate the Dataset

Start by running the following code in this Jupyter Notebook to generate your dataset:


In [67]:
import pandas as pd
import numpy as np

# Seed for reproducibility
np.random.seed(0)

# Generate random data
data = {
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,  # Sales figures between 0 and 1000
    'Transactions': np.random.randint(1, 100, size=100)  # Transactions between 1 and 99 (upper bound is exclusive)
}

# Create DataFrame
df = pd.DataFrame(data)
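
As an optional sanity check, the sketch below rebuilds the same frame and confirms its shape and value ranges. It assumes the legacy `np.random.seed` interface used above. Note that `np.random.randint(1, 100, ...)` excludes the upper bound, so `Transactions` can be at most 99.

```python
import numpy as np
import pandas as pd

# Assumption: same seed and generation code as the Setup cell
np.random.seed(0)
df = pd.DataFrame({
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,
    'Transactions': np.random.randint(1, 100, size=100),
})

# Shape should be 100 rows by 3 columns
print(df.shape)

# Column types: one text column and two numeric columns
print(df.dtypes)

# Range checks: Sales lies in [0, 1000), Transactions in [1, 99]
sales_in_range = bool(df['Sales'].between(0, 1000).all())
transactions_in_range = bool(df['Transactions'].between(1, 99).all())
print(sales_in_range, transactions_in_range)
```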

#### Task 1: Explore the Dataset

Familiarize yourself with the dataset structure and basic statistics.

1. **Display the first few rows of the dataset.** Insert your code below:


In [68]:
# INSERT CODE HERE to display the first 5 rows of the dataframe

print(df.head())

  Region       Sales  Transactions
0  North  570.196770            29
1   West  438.601513             3
2  South  988.373838            28
3  North  102.044811            84
4   West  208.876756            90


2. **Calculate and display basic statistics for the 'Sales' and 'Transactions' columns.** Insert your code below:


In [69]:
# INSERT CODE HERE to calculate basic statistics for 'Sales' and 'Transactions'

print(df[['Sales', 'Transactions']].describe())

            Sales  Transactions
count  100.000000    100.000000
mean   496.438899     48.440000
std    283.716158     28.051655
min      4.695476      1.000000
25%    262.365019     28.750000
50%    544.924754     44.500000
75%    700.581602     72.500000
max    998.847007     98.000000
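
The `describe()` call above summarizes only the two numeric columns. A minimal sketch of how the categorical `Region` column might be explored as well, assuming the same seeded dataset (the variable name `region_counts` is illustrative):

```python
import numpy as np
import pandas as pd

# Assumption: same seed and generation code as the Setup cell
np.random.seed(0)
df = pd.DataFrame({
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,
    'Transactions': np.random.randint(1, 100, size=100),
})

# Number of rows per region, largest group first
region_counts = df['Region'].value_counts()
print(region_counts)

# Number of distinct regions present in the data
print(df['Region'].nunique())
```

Uneven group sizes matter later: an average computed over fewer rows is noisier than one computed over many.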


#### Task 2: Data Analysis

Perform basic analyses to extract insights from the data.

1. **Calculate the average sales and transactions per region.** Insert your code below:


In [70]:
# INSERT CODE HERE to calculate the average sales per region
# INSERT CODE HERE to calculate the average transactions per region

print(df.groupby('Region')[['Sales', 'Transactions']].mean())

             Sales  Transactions
Region                          
East    564.093444     43.684211
North   515.117684     47.320000
South   466.730246     44.708333
West    463.957703     54.937500
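
An alternative sketch of the same per-region averages uses pandas named aggregation, which also lets you report how many rows each average is based on. The output column names (`avg_sales`, `avg_transactions`, `n_rows`) are illustrative choices, not part of the assignment:

```python
import numpy as np
import pandas as pd

# Assumption: same seed and generation code as the Setup cell
np.random.seed(0)
df = pd.DataFrame({
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,
    'Transactions': np.random.randint(1, 100, size=100),
})

# One row per region: mean of each numeric column plus the group size
summary = df.groupby('Region').agg(
    avg_sales=('Sales', 'mean'),
    avg_transactions=('Transactions', 'mean'),
    n_rows=('Sales', 'size'),
)
print(summary)
```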


2. **Find and display the region with the highest average sales.** Insert your code below:


In [71]:
# INSERT CODE HERE to find the region with the highest average sales

print(df.groupby('Region')['Sales'].mean().idxmax())

East
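
`idxmax()` returns only the top label. If you also want to see how the other regions rank, one possible approach is to sort the per-region means, as in this sketch (the names `avg_sales` and `top_region` are illustrative):

```python
import numpy as np
import pandas as pd

# Assumption: same seed and generation code as the Setup cell
np.random.seed(0)
df = pd.DataFrame({
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,
    'Transactions': np.random.randint(1, 100, size=100),
})

# Average sales per region, ranked from highest to lowest
avg_sales = df.groupby('Region')['Sales'].mean().sort_values(ascending=False)
print(avg_sales)

# The first index label is the region with the highest average
top_region = avg_sales.index[0]
print(top_region)
```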


#### Task 3: Insights Reporting

Based on your analysis, answer the following questions in a markdown cell in your Jupyter Notebook:

1. What are the average sales and transactions per region?
2. Which region has the highest average sales, and what might this imply?
3. Did you notice any interesting patterns or anomalies in the data? How might you investigate these further?

**Placeholder for Student's Analysis:**


1. The average sales and transactions per region are as follows:

    | Region | Sales      | Transactions |
    |--------|------------|--------------|
    | East   | 564.093444 | 43.684211    |
    | North  | 515.117684 | 47.320000    |
    | South  | 466.730246 | 44.708333    |
    | West   | 463.957703 | 54.937500    |


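2. The region with the highest average sales is East (about 564.09). Because the data is randomly generated and `Sales` is drawn independently of `Region`, this lead most likely reflects chance variation rather than a real regional effect.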

3. One notable pattern is the variation in average sales across regions, from about 464 in West to about 564 in East. West also has the lowest average sales but the highest average transactions (about 54.94). To investigate further, we could:
    - Analyze the distribution of sales within each region
    - Look into external factors such as economic conditions, population density, or regional preferences
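
The first follow-up step above, analyzing the distribution of sales within each region, could be sketched as follows. It uses only pandas; the standard error of the mean (`sem`) gives a rough sense of whether the gaps between regional averages are larger than sampling noise. The names `spread` and `corr` are illustrative:

```python
import numpy as np
import pandas as pd

# Assumption: same seed and generation code as the Setup cell
np.random.seed(0)
df = pd.DataFrame({
    'Region': np.random.choice(['North', 'South', 'East', 'West'], size=100),
    'Sales': np.random.rand(100) * 1000,
    'Transactions': np.random.randint(1, 100, size=100),
})

# Per-region size, mean, spread, and standard error of the mean for Sales
spread = df.groupby('Region')['Sales'].agg(['count', 'mean', 'std', 'sem'])
print(spread)

# Pearson correlation between Sales and Transactions across all rows
corr = df['Sales'].corr(df['Transactions'])
print(corr)
```

If the differences between regional means are comparable to one or two standard errors, they are hard to distinguish from chance.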
