Group 4 

- Maisaa Alhulimi 
- Sara Alsadon  
- Amjad Abdullah  

# Project: Understand Customers

**Data exploration**: The data will be explored to understand the distribution of customer characteristics. This will include calculating the mean age of customers, the age range of female and male customers, the most common first name, and the least common last name.

**Benefits**:

The results of this project can be used by businesses to:

- Understand their customers better
- Develop more effective marketing strategies
- Target their products and services to the right customers
- Increase customer loyalty

**Tasks**:

Please read the `customers.csv` file to use as your dataset to explore the following:

1. the mean age of customers
1. the age range of female customers (min and max)
1. the age range of male customers (min and max)
1. the most common first name
1. the least common last name



In [2]:
import pandas as pd
from datetime import datetime

# Read the customers.csv file into a DataFrame
df = pd.read_csv('customers.csv')

# Calculate the mean age of customers
current_date = datetime.now()
df['date_of_birth'] = pd.to_datetime(df['date_of_birth'])  # Convert date_of_birth column to datetime
df['age'] = (current_date - df['date_of_birth']).astype('<m8[Y]')  # Calculate age based on current date

mean_age = df['age'].mean()
print(f"The mean age of customers is: {mean_age}")

# Filter the DataFrame for female customers and calculate the age range
female_customers = df[df['gender'] == 'F']
min_age_female = female_customers['age'].min()
max_age_female = female_customers['age'].max()
print(f"The age range of female customers is: {min_age_female} to {max_age_female}")

# Filter the DataFrame for male customers and calculate the age range
male_customers = df[df['gender'] == 'M']
min_age_male = male_customers['age'].min()
max_age_male = male_customers['age'].max()
print(f"The age range of male customers is: {min_age_male} to {max_age_male}")

# Find the most common first name
most_common_first_name = df['first_name'].value_counts().idxmax()
print(f"The most common first name is: {most_common_first_name}")

# Find the least common last name
least_common_last_name = df['last_name'].value_counts().idxmin()
print(f"The least common last name is: {least_common_last_name}")

The mean age of customers is: 27.56
The age range of female customers is: 18.0 to 46.0
The age range of male customers is: 12.0 to 40.0
The most common first name is: Roger
The least common last name is: Dukins


In [5]:
import os
from datetime import datetime

def calculate_age(date_of_birth):
    current_date = datetime.now()
    age = current_date.year - date_of_birth.year
    if (current_date.month, current_date.day) < (date_of_birth.month, date_of_birth.day):
        age -= 1
    return age

# Define the path to the customers.csv file
file_path = 'customers.csv'

# Read the CSV file and store the data in a list of dictionaries
data = []
with open(file_path, 'r') as file:
    lines = file.readlines()
    header = lines[0].strip().split(',')
    for line in lines[1:]:
        values = line.strip().split(',')
        record = dict(zip(header, values))
        data.append(record)

# Calculate the mean age of customers
current_date = datetime.now()
total_age = 0
valid_records = 0
for record in data:
    date_of_birth = datetime.strptime(record['date_of_birth'], '%Y-%m-%d')
    age = calculate_age(date_of_birth)
    if age >= 0:
        total_age += age
        valid_records += 1

mean_age = total_age / valid_records if valid_records > 0 else 0
print(f"The mean age of customers is: {mean_age}")

# Filter the records for female customers and calculate the age range
female_records = [record for record in data if record['gender'] == 'Female']
female_ages = [calculate_age(datetime.strptime(record['date_of_birth'], '%Y-%m-%d')) for record in female_records]
min_age_female = min(female_ages) if female_ages else 0
max_age_female = max(female_ages) if female_ages else 0
print(f"The age range of female customers is: {min_age_female} to {max_age_female}")

# Filter the records for male customers and calculate the age range
male_records = [record for record in data if record['gender'] == 'Male']
male_ages = [calculate_age(datetime.strptime(record['date_of_birth'], '%Y-%m-%d')) for record in male_records]
min_age_male = min(male_ages) if male_ages else 0
max_age_male = max(male_ages) if male_ages else 0
print(f"The age range of male customers is: {min_age_male} to {max_age_male}")

# Find the most common first name
first_names = [record['first_name'] for record in data]
most_common_first_name = max(set(first_names), key=first_names.count) if first_names else ''
print(f"The most common first name is: {most_common_first_name}")

# Find the least common last name
last_names = [record['last_name'] for record in data]
least_common_last_name = min(set(last_names), key=last_names.count) if last_names else ''
print(f"The least common last name is: {least_common_last_name}")

The mean age of customers is: 27.56
The age range of female customers is: 0 to 0
The age range of male customers is: 0 to 0
The most common first name is: Roger
The least common last name is: Stewart
