In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [6]:
!pip install pandas openpyxl




### Housing Data Descriptions:

1. **Occupied Private Dwelling Type**: 
   - The kind of housing: a separate house, a joined dwelling (such as an apartment or duplex), or another private dwelling. Rent often differs by dwelling type.

2. **Dwelling Record Type for Occupied Dwellings**: 
   - How each occupied dwelling is classified in the census records, broadly as a private or a non-private dwelling.

3. **Non-private Dwelling Type for Occupied Non-private Dwellings**: 
   - Dwellings that are not private homes, such as hostels or student dorms. Rents here may differ from those for private dwellings.

4. **Number of Rooms & Number of Bedrooms**: 
   - The size of the dwelling. More rooms or bedrooms usually mean higher rent.

5. **Main Types of Heating & Fuel Types Used to Heat Dwellings**: 
   - How the dwelling is heated and with which fuel. Running costs vary by heating type, which can affect rent, especially in colder areas.

6. **Dwelling Occupancy Status for All Dwellings**: 
   - Whether a dwelling is currently occupied or empty. An empty dwelling might be priced differently.

7. **Access to Basic Amenities**: 
   - Essential facilities in a dwelling, such as running water, electricity, or cooking facilities. Dwellings with more amenities might command higher rent.

8. **Dwelling Dampness & Mould Indicator**: 
   - Whether a dwelling shows signs of dampness or mould. Dwellings with these problems might rent for less because they can be unhealthy to live in.



In [29]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-dwellings-AucklandRegion_updated_04-11-21.csv'
data1 = pd.read_csv(url)
data1.head()


Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,"2006 Census, occupied private dwelling type(1)(10)(16)",Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,"2013 Census, occupied private dwelling type(1)(10)(16)",Unnamed: 8,Unnamed: 9,...,Unnamed: 165,Unnamed: 166,Unnamed: 167,"2018 Census, dwelling mould indicator,(9)(13)(18)\nfor occupied private dwellings",Unnamed: 169,Unnamed: 170,Unnamed: 171,Unnamed: 172,Unnamed: 173,Unnamed: 174
0,Area_Code,Area_Description,Separate house,Joined dwelling,Other private dwelling,Private dwelling not further defined,Total,Separate house,Joined dwelling,Other private dwelling,...,Total stated,Not elsewhere included,Total,Mould over A4 size always,Mould over A4 size sometimes,Total mould(22),No mould / mould smaller than A4 size,Total stated,Not elsewhere included,Total
1,110200,Okahukura Peninsula,462,6,9,27,504,477,6,9,...,474,54,525,24,78,102,369,471,54,525
2,110300,Inlet Kaipara Harbour South,C,C,C,C,0,C,C,C,...,C,C,0,C,C,C,C,C,C,0
3,110400,Cape Rodney,918,24,21,60,1026,1014,24,15,...,1170,132,1305,54,168,222,957,1179,126,1305
4,110500,Wellsford,543,60,0,27,627,570,60,3,...,597,60,657,36,108,144,453,597,60,657
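
The preview above shows two things that need handling before analysis: the real sub-column labels sit in the first data row (`Area_Code`, `Separate house`, ...), and some cells hold the letter `C` instead of a count. The following is a minimal sketch of one way to tidy this, run on a small hand-built excerpt rather than the full file. The helper names `promote_label_row` and `to_numeric_counts` are hypothetical, and treating `C` as a suppressed value that should become missing is an assumption.

```python
import pandas as pd

# Hypothetical excerpt mirroring the layout of data1.head(): the first data row
# holds the real sub-column labels, and some cells contain "C".
sample = pd.DataFrame(
    {
        "Statistical area 2 code (2018 areas)": ["Area_Code", "110200", "110300", "110400"],
        "Statistical area 2 description": [
            "Area_Description",
            "Okahukura Peninsula",
            "Inlet Kaipara Harbour South",
            "Cape Rodney",
        ],
        "2006 Census, occupied private dwelling type": ["Separate house", "462", "C", "918"],
        "Unnamed: 3": ["Joined dwelling", "6", "C", "24"],
    }
)


def promote_label_row(df):
    """Use the first data row as column labels and drop it from the body."""
    cleaned = df.iloc[1:].reset_index(drop=True)
    cleaned.columns = df.iloc[0].tolist()
    return cleaned


def to_numeric_counts(df, id_columns=("Area_Code", "Area_Description")):
    """Convert count columns to numbers; non-numeric cells such as "C" become NaN."""
    out = df.copy()
    count_columns = [c for c in out.columns if c not in id_columns]
    out[count_columns] = out[count_columns].apply(pd.to_numeric, errors="coerce")
    return out


tidy = to_numeric_counts(promote_label_row(sample))
print(tidy)
```

In the full table the sub-labels repeat across census years (for example, `Separate house` appears for both 2006 and 2013), so a real pipeline would likely need to prefix each label with its census year to keep column names unique.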



### Dataset Description for House Rental Price Prediction in New Zealand

**1. Total households in occupied private dwellings**:
   - This metric shows the number of homes currently being lived in. It can help us gauge housing demand in an area.

**2. Tenure of household, for households in occupied private dwellings**: 
   - This data indicates whether homes are owned, rented, or held under another type of tenure. A high share of rented homes might point to high mobility or high purchase prices in an area, either of which could affect rental prices.

**3. Sector of landlord, for households in rented occupied private dwellings**: 
   - This specifies if the landlord is an individual, a company, or a government entity. For instance, government-subsidized rents might be priced differently than those from private landlords.

**4. Weekly rent paid by household, for households in rented occupied private dwellings**: 
   - A critical metric showing the weekly rent amounts paid by renters. This direct data will be a foundational piece for our rental price prediction model.

**5. Number of motor vehicles, for households in occupied private dwellings**: 
   - This data shows the number of cars per household. It might indirectly hint at a household's socioeconomic level or the accessibility of public transport in an area.

**6. Access to telecommunication systems, for households in occupied private dwellings**: 
   - This metric indicates whether a household has access to telecommunication systems such as a mobile phone, a telephone, or the internet. In areas where remote work is common, internet access might influence rental prices.



In [30]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-household-AucklandRegion.csv'
data2 = pd.read_csv(url)
data2.head()


Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,"Total households, \nin occupied private dwellings",Unnamed: 3,Unnamed: 4,"2006 Census, tenure of household,(1)(6)\nfor households in occupied private dwellings",Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 130,Unnamed: 131,"2018 Census, access to telecommunication systems,(5)(9)\nfor households in occupied private dwellings",Unnamed: 133,Unnamed: 134,Unnamed: 135,Unnamed: 136,Unnamed: 137,Unnamed: 138,Unnamed: 139
0,Area_Code,Area_Description,2006.0,2013.0,2018.0,Dwelling owned or partly owned,Dwelling held in a family trust,Total owned(11),Dwelling not owned and not held in a family trust,Total stated,...,Not Elsewhere Included,Total,No access to telecommunication systems,Access to a cellphone / mobile phone,Access to a telephone,Access to a fax machine(10),Access to the internet,Total stated,Not Elsewhere Included,Total
1,110200,Okahukura Peninsula,495.0,537.0,522.0,285,51,336,126,465,...,39,537,6,450,324,..,387,486,36,522
2,110300,Inlet Kaipara Harbour South,0.0,0.0,0.0,C,C,C,C,C,...,C,0,C,C,C,..,C,C,C,0
3,110400,Cape Rodney,1005.0,1155.0,1275.0,564,126,690,249,939,...,96,1155,9,1044,813,..,1035,1170,102,1275
4,110500,Wellsford,621.0,666.0,654.0,339,39,378,204,582,...,36,666,12,558,342,..,468,615,42,654
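
Raw counts are hard to compare between areas of different size, so a share is often more useful as a model feature. As a rough sketch, the snippet below computes the 2018 share of households with internet access from the values visible in the preview above. The short column names are placeholders chosen for this example, and coercing the `..` and `C` markers to missing values is an assumption about how they should be treated.

```python
import pandas as pd

# Values copied from the 2018 telecommunication columns of the preview above;
# the short column names are placeholders for this sketch.
telecom = pd.DataFrame(
    {
        "Area_Description": [
            "Okahukura Peninsula",
            "Inlet Kaipara Harbour South",
            "Cape Rodney",
            "Wellsford",
        ],
        "fax": ["..", "..", "..", ".."],
        "internet": ["387", "C", "1035", "468"],
        "total_stated": ["486", "C", "1170", "615"],
    }
)

# Non-numeric markers ("..", "C") become NaN instead of raising an error.
numeric = telecom[["fax", "internet", "total_stated"]].apply(pd.to_numeric, errors="coerce")

# Share of households that stated an answer and reported internet access.
telecom["internet_share"] = numeric["internet"] / numeric["total_stated"]
print(telecom[["Area_Description", "internet_share"]])
```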



1. **Census usually resident population count**: 
   - The total number of people who live in an area most of the time.

2. **Census night population count**: 
   - The number of people in an area on a specific night when the census was taken.

3. **Unit record data source**: 
   - Where each person's census record came from, for example an individual census form or the household listing only.

4. **Sex**: 
   - Whether a person is male or female.

5. **Age in five year groups**: 
   - Grouping people by age in five-year bands, like 0-4 years, 5-9 years, and so on.

6. **Age in broad groups**: 
   - Grouping people by age in larger groups, like children, adults, and seniors.

7. **Age in five year groups by sex**: 
   - Grouping people by age and whether they are male or female.

8. **Years at usual residence**: 
   - How many years a person has lived in their current home.

9. **Usual residence five years ago indicator**: 
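   - Where a person was living five years ago relative to their current address (for example, at the same address or somewhere else).

10. **Usual residence one year ago indicator**: 
    - Where a person was living one year ago relative to their current address.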

11. **Birthplace**: 
    - The country or city where a person was born.

12. **Birthplace (broad geographic areas)**: 
    - The general area or region where a person was born.

13. **Years since arrival in New Zealand**: 
    - How many years it has been since a person came to live in New Zealand.


In [31]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-individual-part-1-AucklandRegion_updated_28-7-20.csv'
data3 = pd.read_csv(url)
data3.head()

Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,Census usually resident population count(1)(14),Unnamed: 3,Unnamed: 4,Census night population count(2)(15),Unnamed: 6,Unnamed: 7,"2018 Census, unit record data source,\nfor the census usually resident population count(14)",Unnamed: 9,...,Unnamed: 425,Unnamed: 426,Unnamed: 427,Unnamed: 428,"2018 Census, Māori descent,(12)\nfor the census usually resident population count(14)",Unnamed: 430,Unnamed: 431,Unnamed: 432,Unnamed: 433,Unnamed: 434
0,Area_Code,Area_Description,2006 Census,2013 Census,2018 Census,2006 Census,2013 Census,2018 Census,2018 Census individual form,Individuals on the household listing only(16),...,Don't know,Total stated,Not elsewhere included,Total,Māori descent,No Māori descent,Don't know,Total stated,Not elsewhere included,Total
1,110200,Okahukura Peninsula,1380,1359,1491,1365,1353,1479,1248,39,...,45,1203,156,1359,339,1104,45,1491,0,1491
2,110300,Inlet Kaipara Harbour South,0,0,0,0,0,0,C,C,...,C,C,C,0,C,C,C,C,C,0
3,110400,Cape Rodney,2760,3096,3525,2745,3231,3561,2964,129,...,75,2682,411,3096,663,2736,126,3525,0,3525
4,110500,Wellsford,1671,1713,1929,1686,1713,1926,1551,87,...,45,1509,207,1713,513,1344,72,1929,0,1929
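
Because the usually resident population count is reported for three censuses, it can be turned into a growth rate, which might be a candidate feature for rental demand. The sketch below uses the counts shown in the preview above; the column names `2006`, `2013`, and `2018` and the variable names are placeholders for this example.

```python
import pandas as pd

# Census usually resident population counts copied from the preview above.
population = pd.DataFrame(
    {
        "2006": [1380, 2760, 1671],
        "2013": [1359, 3096, 1713],
        "2018": [1491, 3525, 1929],
    },
    index=["Okahukura Peninsula", "Cape Rodney", "Wellsford"],
)

# Relative change between consecutive censuses (0.10 means 10% growth).
growth = pd.DataFrame(
    {
        "2006_to_2013": population["2013"] / population["2006"] - 1,
        "2013_to_2018": population["2018"] / population["2013"] - 1,
    }
)
print(growth.round(3))
```

Areas with a zero count, such as Inlet Kaipara Harbour South, are left out here because a growth rate is undefined when the starting population is zero.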



### Feature Relevance for House Rental Price Model:

When predicting house rental prices in New Zealand, it's essential to determine the relevance of potential features. Below is a brief overview:

1. **Religious Affiliation**: Gives insight into cultural diversity, but may not directly impact rental prices.
 
2. **Cigarette Smoking Behaviour**: Likely not directly relevant to rental prices.

3. **Health and Accessibility Metrics**: While these features (like difficulty in seeing or walking) show health trends, they may not have a direct relation to rental costs. 

4. **Relationship Status**: Provides insight into family demographics in the area, which may shape demand for different dwelling sizes.

5. **Individual Home Ownership**: Highly relevant. Areas with high homeownership might have different rental characteristics.

6. **Number of Children Born**: Can influence demand for property type and size.

7. **Education Metrics**: Locations near educational institutions or with a high student population might have specific rental demands.

8. **Income Metrics**: Areas with higher incomes might exhibit higher rental prices. The source of income can offer additional insights.

9. **Travel Means to Education**: Indicates transport infrastructure and proximity to educational establishments.
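
The relevance judgments above are qualitative. One simple way to check them against data is to correlate each candidate feature with the target and rank the results. The sketch below does this with Pearson correlation on a tiny synthetic table: every number in it is invented purely for illustration and is not taken from the census files, and the column names are hypothetical.

```python
import pandas as pd

# Synthetic, made-up values for illustration only (not census data).
toy = pd.DataFrame(
    {
        "median_income": [40, 55, 60, 75, 90],
        "smoker_share": [0.15, 0.12, 0.16, 0.11, 0.14],
        "weekly_rent": [300, 380, 420, 500, 610],
    }
)

features = toy.drop(columns="weekly_rent")
target = toy["weekly_rent"]

# Pearson correlation of each feature with the target, ranked by strength.
correlations = features.corrwith(target)
ranking = correlations.abs().sort_values(ascending=False)
print(ranking)
```

A correlation screen like this only captures linear, one-feature-at-a-time relationships, so it is a first filter rather than a final verdict on which features to keep.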



In [32]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-individual-part-2-AucklandRegion.csv'
data4 = pd.read_csv(url)
data4.head()

Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,"2006 Census, religious affiliation (total responses)(1)(15), \nfor the census usually resident population count(14)",Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 339,Unnamed: 340,Unnamed: 341,Unnamed: 342,Unnamed: 343,Unnamed: 344,Unnamed: 345,Unnamed: 346,Unnamed: 347,Unnamed: 348
0,Area_Code,Area_Description,No religion,Buddhism,Christian,Hinduism,Islam,Judaism,"Māori religions, beliefs and philosophies",Spiritualism and New Age religions,...,Bicycle,Walk or jog,School bus,Public bus,Train,Ferry,Other,Total stated,Not elsewhere included,Total
1,110200,Okahukura Peninsula,540,3,618,0,0,3,24,3,...,0,0,12,0,0,0,0,54,0,54
2,110300,Inlet Kaipara Harbour South,C,C,C,C,C,C,C,C,...,C,C,C,C,C,C,C,C,C,0
3,110400,Cape Rodney,1143,18,1110,6,6,6,48,30,...,6,15,69,0,0,0,0,237,0,237
4,110500,Wellsford,501,6,837,15,6,3,99,6,...,3,66,288,3,0,0,3,630,0,630
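
Each table loaded so far carries the same `Area_Code` key, so features from different files can be combined into one row per area. The following is a minimal sketch of such a join, using the 2018 household and population totals visible in the earlier previews. The frame and column names are placeholders, and it assumes the label row has already been promoted to column names.

```python
import pandas as pd

# 2018 totals copied from the household and population previews above.
households = pd.DataFrame(
    {
        "Area_Code": ["110200", "110300", "110400", "110500"],
        "households_2018": [522, 0, 1275, 654],
    }
)
population = pd.DataFrame(
    {
        "Area_Code": ["110200", "110300", "110400", "110500"],
        "population_2018": [1491, 0, 3525, 1929],
    }
)

# validate="one_to_one" raises if either table has duplicate area codes.
combined = households.merge(population, on="Area_Code", how="inner", validate="one_to_one")

# Mask zero household counts so empty areas yield NaN rather than a division error or inf.
nonzero_households = combined["households_2018"].where(combined["households_2018"] > 0)
combined["people_per_household"] = combined["population_2018"] / nonzero_households
print(combined)
```

Keeping `Area_Code` as a string in every table avoids silent mismatches when one file is parsed as integers and another as text.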



**Work and Labour Force Status**:
- This shows how many people in an area are employed full time, employed part time, unemployed, or not in the labour force. Places with more people working might have higher rent prices because more households can afford to compete for homes.

**Status in Employment**:
- This tells us how people are employed, for example as a paid employee, an employer, or self-employed. Areas with many people in stable paid employment might have higher rents because they have a steady income.

**Occupation by Residence & Workplace Address**:
- This information reveals the types of jobs people do and where they work. Areas with lots of high-paying jobs, like doctors or lawyers, might have higher rents. Also, if people live close to their work, they might be willing to pay more for rent.

**Industry by Residence & Workplace Address**:
- This tells us about the businesses and industries where people work. For instance, areas close to big business hubs might have higher rents. On the other hand, places with seasonal jobs might have different rent patterns.


In [34]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-individual-part-3a-AucklandRegion.csv'
data5 = pd.read_csv(url)
data5.head()

Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,"2006 Census, work and labour force status,(1)(8)\nfor the census usually resident population count aged 15 years and over(7)",Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,"2013 Census, work and labour force status,(1)(8)\nfor the census usually resident population count aged 15 years and over(7)",...,Unnamed: 232,Unnamed: 233,Unnamed: 234,Unnamed: 235,Unnamed: 236,Unnamed: 237,Unnamed: 238,Unnamed: 239,Unnamed: 240,Unnamed: 241
0,Area_Code,Area_Description,Employed Full time,Employed Part time,Unemployed,Not in the Labour Force,Total stated,Work and Labour Force Status Unidentifiable,Total,Employed Full time,...,Professional Scientific and Technical Services,Administrative and Support Services,Public Administration and Safety,Education and Training,Health Care and Social Assistance,Arts and Recreation Services,Other Services,Total stated,Not Elsewhere Included,Total
1,110200,Okahukura Peninsula,561,147,42,252,1002,39,1038,486,...,15,6,6,12,6,3,15,336,0,336
2,110300,Inlet Kaipara Harbour South,C,C,C,C,C,C,0,C,...,C,C,C,C,C,C,C,C,C,3
3,110400,Cape Rodney,1080,366,48,564,2055,90,2145,1023,...,93,30,0,60,54,69,27,1014,0,1014
4,110500,Wellsford,573,156,42,471,1242,21,1260,456,...,45,9,15,102,75,0,42,777,0,777
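
The labour force counts in the preview above can be turned into rates that are comparable across areas. As a sketch, the snippet below derives an employment share and an unemployment rate from the 2006 columns; the short column names are placeholders for this example.

```python
import pandas as pd

# 2006 work and labour force counts copied from the preview above.
labour = pd.DataFrame(
    {
        "full_time": [561, 1080, 573],
        "part_time": [147, 366, 156],
        "unemployed": [42, 48, 42],
        "not_in_labour_force": [252, 564, 471],
        "total_stated": [1002, 2055, 1242],
    },
    index=["Okahukura Peninsula", "Cape Rodney", "Wellsford"],
)

employed = labour["full_time"] + labour["part_time"]

# Share of people with a stated status who are employed.
labour["employment_share"] = employed / labour["total_stated"]

# Unemployed as a share of the labour force (employed plus unemployed).
labour["unemployment_rate"] = labour["unemployed"] / (employed + labour["unemployed"])
print(labour[["employment_share", "unemployment_rate"]].round(3))
```

Note that the published components do not always add up to the published total (for Cape Rodney, 1080 + 366 + 48 + 564 is 2058, not 2055), so dividing by `total_stated` rather than by a recomputed sum is a deliberate choice here.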



**Total Hours Worked in Employment per Week**:
- This tells us how long people work each week. If many people in an area work longer hours, they might have more income and could pay higher rents.

**Main Means of Travel to Work by Residence & Workplace Address**:
- This shows how people get to their jobs, like by car, bus, or walking. Areas with good public transport might have higher rents because it's easier for people to get around. On the other hand, places where most people drive might need homes with parking spaces.

**Unpaid Activities**:
- This might include things like volunteering or taking care of family. If many people in an area do lots of unpaid work, it could mean they have less income from paid jobs. This could influence how much they can afford for rent.



In [35]:
url = 'https://raw.githubusercontent.com/robertoaltran/Population/main/2018-SA1-dataset-individual-part-3b-AucklandRegion_updated_16-7-20.csv'
data6 = pd.read_csv(url)
data6.head()

Unnamed: 0,Statistical area 2 code (2018 areas),Statistical area 2 description,"2006 Census, total hours worked in employment per week,(1)(6)\n for the employed census usually resident population count aged 15 years and over(5)",Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,"2018 Census, unpaid activities,(3)(4)(13)\nfor the census usually resident population count aged 15 years and over(5)",Unnamed: 277,Unnamed: 278,Unnamed: 279,Unnamed: 280,Unnamed: 281,Unnamed: 282,Unnamed: 283,Unnamed: 284,Unnamed: 285
0,,,1-9 hours worked,10-19 hours worked,20-29 hours worked,30-39 hours worked,40-49 hours worked,50-59 hours worked,60 hours or more worked,Total stated,...,No activities,"Household work, cooking, repairs, gardening, ...",Looking after a child who is a member of own h...,Looking after a member of own household who is...,Looking after a child who does not live in own...,Helping someone who is ill or has a disability...,Other helping or voluntary work for or through...,Total stated,Not elsewhere included,Total
1,Area_Code,Area_Description,,,,,,,,,...,,,,,,,,,,
2,110200,Okahukura Peninsula,33,48,51,78,225,102,129,672,...,93,849,309,84,141,81,165,963,219,1182
3,110300,Inlet Kaipara Harbour South,C,C,C,C,C,C,C,C,...,C,C,C,C,C,C,C,C,C,0
4,110400,Cape Rodney,78,114,138,165,408,192,249,1344,...,192,2118,702,225,384,237,411,2355,483,2838
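
Unlike the earlier files, this preview spreads the column labels over two rows: row 0 holds the category labels (`1-9 hours worked`, ...) and row 1 holds `Area_Code` and `Area_Description`. A possible way to handle this, sketched on a hand-built excerpt with the same layout, is to merge the two rows into a single label row before dropping them. The variable names are placeholders.

```python
import pandas as pd

# Hypothetical excerpt mirroring the layout of data6.head(): labels are split
# across the first two rows, with gaps where the other row carries the label.
sample = pd.DataFrame(
    {
        "Statistical area 2 code (2018 areas)": [None, "Area_Code", "110200", "110300", "110400"],
        "Statistical area 2 description": [
            None,
            "Area_Description",
            "Okahukura Peninsula",
            "Inlet Kaipara Harbour South",
            "Cape Rodney",
        ],
        "2006 Census, total hours worked in employment per week": [
            "1-9 hours worked", None, "33", "C", "78",
        ],
        "Unnamed: 3": ["10-19 hours worked", None, "48", "C", "114"],
    }
)

# Take each label from row 0, falling back to row 1 where row 0 is empty.
labels = sample.iloc[0].fillna(sample.iloc[1])

# Drop the two label rows and apply the merged labels.
body = sample.iloc[2:].reset_index(drop=True)
body.columns = labels.tolist()
print(body)
```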


In [38]:
import json
import urllib.request
# datastore_search returns JSON, not CSV; the original q=title:jones filter matched no records, so it is omitted.
url = 'https://catalogue.data.govt.nz/api/3/action/datastore_search?resource_id=89089be6-5165-4582-9a07-0e105126a4e2&limit=5'
with urllib.request.urlopen(url) as response:
    payload = json.load(response)
data7 = pd.DataFrame(payload['result']['records'])
data7.head()
