### First, we create a synthetic dataset for the business problem

**To simulate this scenario, we generate a dataset with the following core columns, plus the additional demographic, location, and amenity features defined in the code below:**

- Date: Date for the time series data (monthly data from January 2010 to December 2023).
- City: Different city locations in Ontario where the retirement houses are located.
- Property_Price: Average property price in that city.
- Occupancy_Rate: Average occupancy rate of retirement houses in each city (percentage).
- Inflation_Rate: Monthly inflation rate (impact on the property market).
- Population_Age_65+: Population above 65 years old in the city (potential customers).

In [13]:
import pandas as pd
import numpy as np

# Parameters for the dataset
np.random.seed(42)
dates = pd.date_range(start='2010-01-01', end='2023-12-01', freq='MS')
cities = ['Toronto', 'Ottawa', 'Mississauga', 'Brampton', 'Hamilton', 'London', 'Markham', 'Vaughan', 'Kitchener', 'Windsor']

# Mean and standard deviation for property prices (the 600k to 1.2M bounds are mean ± 3 standard deviations)
mean_price = 900000
std_dev_price = 100000

# Collect rows in a plain list; the DataFrame is built once at the end
data = []

# Populate the dataset
for city in cities:
    for date in dates:
        property_price = round(np.random.normal(mean_price, std_dev_price))  # Normally distributed property prices
        # Ensure that prices stay within bounds of 600K to 1.2M
        property_price = max(600000, min(1200000, property_price))
        
        occupancy_rate = round(np.random.uniform(70, 95))  # Simulated occupancy rate percentage
        inflation_rate = round(np.random.uniform(1, 4),2)  # Inflation rate
        population_age_65 = round(np.random.normal(50000, 10000))  # Population aged 65+
        population_age_65_growth = round(np.random.uniform(0.5, 3.5),2)  # Population growth rate of 65+
        median_age = round(np.random.uniform(40, 50))  # Median age of the city
        income_level = round(np.random.normal(60000, 15000))  # Average income level for seniors
        proximity_healthcare = round(np.random.uniform(1, 10))  # Distance to nearest healthcare facility in km
        walkability_score = round(np.random.uniform(50, 90))  # Walkability score
        crime_rate = round(np.random.uniform(1, 10))  # Crime rate per 1000 people
        property_tax_rate = round(np.random.uniform(0.5, 2.5), 2)  # Property tax rate percentage (2 decimals; a bare round() would return only 0, 1, or 2)
        public_transport_access = np.random.choice([0, 1], p=[0.3, 0.7])  # Public transport access (1 or 0)
        government_incentives = np.random.choice([0, 1], p=[0.4, 0.6])  # Government incentives (1 or 0)
        environmental_quality = round(np.random.uniform(60, 100), 2)  # Environmental quality score (0-100 scale, sampled from 60-100)
        senior_living_vacancy_rate = round(np.random.uniform(5, 15), 2)  # Vacancy rate in senior living facilities (percentage)
        mental_health_programs = np.random.choice([0, 1], p=[0.4, 0.6])  # Mental health programs (1 or 0)
        memory_care_availability = np.random.choice([0, 1], p=[0.3, 0.7])  # Memory care availability (1 or 0)

        data.append([date, city, property_price, occupancy_rate, inflation_rate, population_age_65, population_age_65_growth,
                     median_age, income_level, proximity_healthcare, walkability_score, crime_rate, property_tax_rate,
                     public_transport_access, government_incentives, environmental_quality, senior_living_vacancy_rate,
                     mental_health_programs, memory_care_availability])

# Create the DataFrame
columns = ['Date', 'City', 'Property_Price', 'Occupancy_Rate', 'Inflation_Rate', 'Population_Age_65+', 
           'Population_Age_65_Growth', 'Median_Age', 'Income_Level', 'Proximity_to_Healthcare', 'Walkability_Score', 
           'Crime_Rate', 'Property_Tax_Rate', 'Public_Transport_Access', 'Government_Incentives', 
           'Environmental_Quality', 'Senior_Living_Vacancy_Rate', 'Mental_Health_Programs', 'Memory_Care_Availability']

df = pd.DataFrame(data, columns=columns)

# Save the dataset
df.to_csv('../Data/ontario_retirement_house_investment_data.csv', index=False)

print("Dataset created successfully with features.")


Dataset created successfully with features.
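
Before using the generated data, it may be worth running a few sanity checks on the panel structure. The sketch below is one possible way to do that, not part of the original notebook: `validate_panel` is a hypothetical helper, and `demo` is a small stand-in frame built with the same date range and price bounds as above (two cities only) so that the example runs on its own. In the notebook, the same function could be applied to `df` instead.

```python
import numpy as np
import pandas as pd


def validate_panel(df, n_cities, n_months):
    """Return basic sanity checks for a city-by-month panel (hypothetical helper)."""
    return {
        # One row per city per month
        "row_count_ok": len(df) == n_cities * n_months,
        # No missing values anywhere
        "no_missing": not bool(df.isna().any().any()),
        # Each (City, Date) pair appears exactly once
        "unique_city_month": not bool(df.duplicated(subset=["City", "Date"]).any()),
        # Prices respect the 600k to 1.2M clipping bounds
        "price_in_bounds": bool(df["Property_Price"].between(600_000, 1_200_000).all()),
        # Occupancy stays within the simulated 70-95 % range
        "occupancy_in_bounds": bool(df["Occupancy_Rate"].between(70, 95).all()),
    }


# Stand-in data (assumption: mirrors the generator above, but with two cities only)
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-01", "2023-12-01", freq="MS")
cities = ["Toronto", "Ottawa"]
index = pd.MultiIndex.from_product([cities, dates], names=["City", "Date"])
demo = index.to_frame(index=False)
demo["Property_Price"] = np.clip(
    rng.normal(900_000, 100_000, len(demo)).round(), 600_000, 1_200_000
)
demo["Occupancy_Rate"] = rng.integers(70, 96, len(demo))  # upper bound is exclusive

checks = validate_panel(demo, n_cities=len(cities), n_months=len(dates))
print(checks)
```

For the full dataset, the expected size is 10 cities × 168 months = 1,680 rows, so a call such as `validate_panel(df, n_cities=10, n_months=168)` would be the corresponding check.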
