# House Rental

## Subject

### Helping to evaluate vacation rental sites

__Short description__: This project evaluates rental advertisements by studying electricity consumption in the towns where the properties are advertised. For users choosing among vacation destinations, it provides additional information about the environmental conditions of those places. In this limited example, the environmental condition is based on the electricity consumption of the destination town.

__Further details__: In France, several house rental websites publish RSS XML feeds that can be parsed into a data set listing the available rentals. The town names appear inside the text of each item.
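
As a minimal sketch of turning a feed into a data set, the snippet below parses a small in-memory RSS document with the standard library instead of fetching a real feed. The sample XML, the title layout (`... - Town - ...`) and the helper name `parse_rentals` are assumptions made for illustration; real feeds may be structured differently.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed content: the title layout "<text> - <town> - <text>"
# is an assumption, not taken from a real feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample rentals</title>
    <item>
      <title>Gite 4 personnes - Bergheim - Alsace</title>
      <link>http://example.com/ad/1</link>
    </item>
    <item>
      <title>Chalet - Saint-Hippolyte - Alsace</title>
      <link>http://example.com/ad/2</link>
    </item>
    <item>
      <title>Listing without a town</title>
      <link>http://example.com/ad/3</link>
    </item>
  </channel>
</rss>
"""

# One token of word characters and hyphens between " - " separators.
TOWN_PATTERN = re.compile(r"-\s([\w-]+)\s-")


def parse_rentals(xml_text):
    """Return one dict per <item>, with the town extracted from the title."""
    root = ET.fromstring(xml_text)  # root is the <rss> element
    rentals = []
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        match = TOWN_PATTERN.search(title)
        # town stays None when the title does not follow the assumed layout
        town = match.group(1).lower() if match else None
        rentals.append({"title": title, "link": link, "town": town})
    return rentals


rentals = parse_rentals(SAMPLE_RSS)
for rental in rentals:
    print(rental["town"], rental["link"])
```

Keeping the link next to the town name means each advertisement can still be identified after it is joined with other data.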

At the same time, the CSV file from [ENEDIS](https://data.enedis.fr/explore/dataset/consommation-electrique-par-secteur-dactivite-commune/) contains the history of electricity consumption, which lets you estimate how much energy is consumed and for what purpose. You can therefore give every community (commune) an “electrical” description by calculating different indicators, for example:
* share/amount of non-residential consumption, which might indicate the importance of industrial installations in the town
* evolution of residential consumption over several years, which might indicate the growth of the town
* evolution of non-residential consumption
* other indicators of your choice
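
The indicators listed above could be computed along the following lines. This is only a sketch on toy data: the consumption column names `conso_res_mwh` and `conso_non_res_mwh` are hypothetical stand-ins (the real ENEDIS column names must be substituted), and the numbers are invented for the example.

```python
import pandas as pd

# Toy data. The two consumption columns are hypothetical names, and the
# values are invented; they are not ENEDIS figures.
toy = pd.DataFrame({
    "Code commune": [1001, 1001, 1001, 2002, 2002, 2002],
    "Année": [2011, 2012, 2013, 2011, 2012, 2013],
    "conso_res_mwh": [100.0, 110.0, 121.0, 200.0, 200.0, 200.0],
    "conso_non_res_mwh": [0.0, 0.0, 0.0, 300.0, 200.0, 100.0],
})


def energy_indicators(df):
    """Compute per-commune indicators from yearly consumption rows."""
    df = df.sort_values(["Code commune", "Année"])
    grouped = df.groupby("Code commune")
    first = grouped.first()   # earliest year per commune
    last = grouped.last()     # latest year per commune
    totals = grouped[["conso_res_mwh", "conso_non_res_mwh"]].sum()

    out = pd.DataFrame(index=totals.index)
    # Share of non-residential consumption over the whole period.
    out["non_res_share"] = totals["conso_non_res_mwh"] / (
        totals["conso_res_mwh"] + totals["conso_non_res_mwh"]
    )
    # Relative change between the first and last year.
    # A commune with zero consumption in the first year yields NaN or inf.
    out["res_growth"] = last["conso_res_mwh"] / first["conso_res_mwh"] - 1
    out["non_res_growth"] = (
        last["conso_non_res_mwh"] / first["conso_non_res_mwh"] - 1
    )
    return out


indicators = energy_indicators(toy)
print(indicators)
```

Growth is measured here as the relative change between the first and last available year; a fitted trend over all years would be an equally reasonable choice.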

Putting both data sources together lets you sort and filter the rental advertisements by these “energy” indicators, for example by selecting “zero industry” advertisements as quiet locations. Finding a concrete use is left to you as part of the exercise.
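
One possible way to combine the two sources is sketched below. The town names are placeholders, the indicator values are invented, and the column name `non_res_share` and the helper `quiet_ads` are hypothetical.

```python
import pandas as pd

# Placeholder advertisements and invented indicator values.
ads = pd.DataFrame({
    "title": ["Ad 1", "Ad 2", "Ad 3", "Ad 4"],
    "town": ["town_a", "town_b", "town_c", "town_without_data"],
})
indicators = pd.DataFrame({
    "town": ["town_a", "town_b", "town_c"],
    "non_res_share": [0.0, 0.62, 0.15],
})


def quiet_ads(ads, indicators, max_share=0.0):
    """Keep ads whose town has at most max_share non-residential consumption."""
    # Inner join: ads whose town has no indicator row are dropped.
    merged = ads.merge(indicators, on="town", how="inner")
    quiet = merged[merged["non_res_share"] <= max_share]
    # Quietest towns first.
    return quiet.sort_values("non_res_share").reset_index(drop=True)


zero_industry = quiet_ads(ads, indicators)
low_industry = quiet_ads(ads, indicators, max_share=0.2)
print(zero_industry)
print(low_industry)
```

A strict threshold of zero implements the “zero industry” filter; a small positive threshold is a softer variant that may be more useful if few towns report exactly zero.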

## Code

### Fetching results from RSS feed

In [11]:
from lxml import etree
import requests
import sys
import pandas as pd
import re

def print_titles():
    """Print the town name found in each rental title of every regional feed."""
    base = "http://www.ty-gites.com/rss/"
    location = "locations_vacances_"
    regions = ["alsace",
               "aquitaine",
               "auvergne",
               "bourgogne",
               "bretagne",
               "centre_val_de_loire",
               "champagne_ardenne",
               "corse",
               "franche_comte",
               "ile_de_france",
               "languedoc_roussillon",
               "limousin",
               "lorraine",
               "midi_pyrenees",
               "nord_pas_de_calais",
               "normandie",
               "pays_de_la_loire",
               "picardie",
               "poitou_charentes",
               "provence_alpes_cote_d_azur",
               "rhone_alpes",
               "outre_mer"]
    
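    # The pattern expects the town name between " - " separators and accepts
    # only word characters and hyphens, so names containing spaces or
    # apostrophes are skipped.
    regex = r"-\s(\w|-)*\s-"
    for region in regions:
        url = base + location + region + ".xml"
        # lxml fetches and parses the feed directly from the HTTP URL.
        root = etree.parse(url)
        for item in root.xpath("/rss/channel/item/title"):
            # Guard against empty <title> elements, whose text is None.
            m = re.search(regex, item.text or "")
            if m:
                # Drop the two dashes; the surrounding spaces are kept.
                print(m.group(0)[1:-1].lower())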

print_titles()


 kogenheim 
 neubois 
 ebersheim 
 soultzeren 
 bergheim 
 saint-hippolyte 
 sondernach 
 sondernach 
 elbach 
 niederhaslach 
 dambach-la-ville 
 epfig 
 epfig 
 katzenthal 
 stosswihr 
 meistratzheim 
 ungersheim 
 mulhouse 
 ranspach 
 issenhausen 
 thénac 
 creysse 
 bergerac 
 varennes 
 thénac 
 trentels 
 saint-pierre-de-buzet 
 bardos 
 arsac 
 marquay 
 sanguinet 
 jumilhac-le-grand 
 sanguinet 
 sanguinet 
 sanguinet 
 sanguinet 
 sanguinet 
 lespielle 
 lespielle 
 domme 
 saint-maurice-de-lignon 
 jou-sous-monjou 
 roumégoux 
 roumégoux 
 roumégoux 
 saint-étienne-sur-usson 
 beauzac 
 blassac 
 arpajon-sur-cère 
 giou-de-mamou 
 massiac 
 anglards-de-salers 
 vichy 
 chomelix 
 badailhac 
 badailhac 
 fontanges 
 omps 
 olby 
 sennevoy-le-bas 
 montambert 
 montbellet 
 moutiers-en-puisaye 
 saint-fargeau 
 septfonds 
 sainte-vertu 
 saint-victor-sur-ouche 
 couches 
 goix 
 chissey-en-morvan 
 saint-martin-sous-montaigu 
 mervans 
 beaubery 
 alligny-en-morvan 
 labergeme

In [13]:
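# Load the ENEDIS consumption export; the file is semicolon-delimited.
conso = pd.read_csv("datasets/conso.csv", delimiter=";")
# Note: this prints the (rows, columns) shape, not DataFrame.size.
print("conso.size = {}".format(conso.shape))

# Concatenating a single frame yields a new frame with the same rows.
conso_tab = pd.concat([conso])
conso_tab.head()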

conso.size = (201963, 43)


| | Année | Nom commune | Code commune | Nom EPCI | Code EPCI | Type EPCI | Nom département | Code département | Nom région | Code région | ... | Résidences principales avant 1919 | Résidences principales de 1919 à 1945 | Résidences principales de 1946 à 1970 | Résidences principales de 1971 à 1990 | Résidences principales de 1991 à 2005 | Résidences principales de 2006 à 2010 | Résidences principales après 2011 | Taux de chauffage électrique | Geo Shape | Geo Point 2D |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011 | La Chapelle-Saint-Maurice | 74060 | CC de la Rive Gauche du Lac d'Annecy | 247400732 | CC | Haute-Savoie | 74 | Auvergne-Rhône-Alpes | 84 | ... | 36.363636 | 5.454545 | 7.272727 | 21.818182 | 12.727273 | 10.909091 | 5.454545 | 36.363636 | | |
| 1 | 2015 | Thaumiers | 18261 | CC le Dunois | 241800424 | CC | Cher | 18 | Centre-Val de Loire | 24 | ... | 65.142857 | 12.571429 | 4.571429 | 7.428571 | 2.285714 | 6.285714 | 1.714286 | 17.714286 | | |
| 2 | 2013 | Neuville-sur-Saône | 69143 | CU de Lyon | 246900245 | CU | Rhône | 69 | Auvergne-Rhône-Alpes | 84 | ... | 12.69379 | 6.374103 | 24.723784 | 35.05317 | 12.476466 | 8.678691 | 0.0 | 17.43724 | | |
| 3 | 2011 | Lachapelle-sous-Rougemont | 90058 | CC du Pays Sous Vosgien | 249000217 | CC | Territoire-de-Belfort | 90 | Bourgogne-Franche-Comté | 27 | ... | 37.190072 | 6.611563 | 9.504127 | 12.809908 | 19.008253 | 13.636384 | 1.239693 | 28.51238 | | |
| 4 | 2016 | Seyre | 31546 | CC Coteaux du Lauragais Sud (Co.Laur.Sud) | 243100179 | CC | Haute-Garonne | 31 | Occitanie | 76 | ... | 23.404272 | 2.127638 | 12.766082 | 17.021358 | 14.89372 | 6.382914 | 23.404017 | 34.042461 | {"type": "Polygon", "coordinates": [[[1.678708... | 43.3635974739, 1.66366858427 |


In [27]:
# Build a lookup of commune names and codes, one row per distinct pair.
codeCommune = (
    conso[["Nom commune", "Code commune"]]
    .rename(columns={"Nom commune": "Commune", "Code commune": "Code Commune"})
    .drop_duplicates()
)

codeCommune.sort_values(['Commune']).head()



| | Commune | Code Commune |
|---|---|---|
| 9270 | Aast | 64001 |
| 57449 | Abainville | 55001 |
| 8401 | Abancourt | 59001 |
| 86115 | Abancourt | 60001 |
| 96027 | Abaucourt | 54001 |
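
The output above shows that a commune name is not a unique key: Abancourt appears with two different codes. The sketch below, which rebuilds only the five rows shown, illustrates one way to match the lower-cased names extracted from the feeds against this lookup while keeping such ambiguities visible. The helpers `normalize_name` and `lookup_codes` are hypothetical, and stripping accents is an assumption about how the two sources should be aligned.

```python
import unicodedata

import pandas as pd

# The five rows shown in the output above.
code_commune = pd.DataFrame({
    "Commune": ["Aast", "Abainville", "Abancourt", "Abancourt", "Abaucourt"],
    "Code Commune": [64001, 55001, 59001, 60001, 54001],
})


def normalize_name(name):
    """Lower-case, trim and strip accents so both sources share one key."""
    decomposed = unicodedata.normalize("NFKD", name.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


code_commune["key"] = code_commune["Commune"].map(normalize_name)


def lookup_codes(town):
    """Return every commune code whose normalized name matches the town."""
    key = normalize_name(town)
    matches = code_commune.loc[code_commune["key"] == key, "Code Commune"]
    return matches.tolist()


print(lookup_codes(" aast "))        # a single candidate
print(lookup_codes(" abancourt "))   # two candidates: ambiguous name
print(lookup_codes("nowhere"))       # no candidate
```

When a name maps to several codes, extra information such as the region of the feed the advertisement came from would be needed to pick the right commune.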
