# ETL Public Remarks from Multi-Listing Service Data
## Input:
This notebook takes in a CSV file that contains the public remarks data from the MLS dataset.

## Processing:
The remarks data will be extracted and parsed for relevant words. Each word will then be grouped with a count; the grouping key is still an open question (lat/lng or subdivision?).
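
The processing step above could be sketched as follows. This is only an illustration under stated assumptions: the `Subdivision` column, the `STOP_WORDS` list, and the helper names `tokenize` and `count_words_by_group` are hypothetical and do not appear in the dataset or the notebook, and the grouping key has not been decided yet.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical stop-word list; a real run would likely need a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "in", "to", "this", "with", "of", "is"}


def tokenize(remark):
    """Lowercase a remark and return its alphabetic words minus stop words."""
    words = re.findall(r"[a-z]+", str(remark).lower())
    return [w for w in words if w not in STOP_WORDS]


def count_words_by_group(df, group_col, text_col="PublicRemarks"):
    """Return one row per (group, word) pair with its occurrence count."""
    rows = []
    for group, remarks in df.groupby(group_col)[text_col]:
        counts = Counter(w for remark in remarks for w in tokenize(remark))
        rows.extend(
            {group_col: group, "word": w, "count": c} for w, c in counts.items()
        )
    return pd.DataFrame(rows, columns=[group_col, "word", "count"])


# Made-up sample rows; "Subdivision" is an assumed grouping column.
sample = pd.DataFrame({
    "Subdivision": ["A", "A", "B"],
    "PublicRemarks": [
        "Hardwood floors and new roof",
        "New roof in 2019",
        "Brick ranch",
    ],
})
word_counts = count_words_by_group(sample, "Subdivision")
```

Grouping by lat/lng instead would only change the `group_col` argument, provided such a column is joined in first.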

## Output:
A new CSV file will be created that can be uploaded to the RealLeads public_remarks table.
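
A minimal sketch of the output step, assuming the processed result is a pandas DataFrame. The column names, the file name `public_remarks_upload.csv`, and the helper `write_remarks_csv` are placeholders, since the schema of the public_remarks table is not described here.

```python
import os
import tempfile

import pandas as pd


def write_remarks_csv(df, out_path):
    """Write df to out_path as CSV without the DataFrame index column."""
    df.to_csv(out_path, index=False)
    return out_path


# Placeholder data; the real columns depend on the public_remarks table schema.
word_counts = pd.DataFrame({"word": ["hardwood", "roof"], "count": [1, 2]})

# A temporary directory keeps this example from touching the real data folder.
out_path = os.path.join(tempfile.mkdtemp(), "public_remarks_upload.csv")
write_remarks_csv(word_counts, out_path)

# Read the file back to confirm that only the intended columns were written.
round_trip = pd.read_csv(out_path)
```

Passing `index=False` keeps the DataFrame's row index out of the file, so the upload contains only the named columns.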

In [1]:
# Imports
import pandas as pd 
import numpy as np

import re


In [2]:
# This function will extract important words from a single remark
# (placeholder: for now it only prints the remark it receives)
def extract_words(remark):
    print(f"Remarks: {remark}")

def extract_remarks(remarks_file): 
    print(f"Remarks dataset: {remarks_file}")

    # Read in the public remarks data into a DataFrame
    remarks_data = pd.read_csv(remarks_file, low_memory=False)

    return remarks_data


In [3]:
# Create path to the RealLeads data folder and file containing the remarks data.
file_dir = '../../data'

# Full path and name of file to be processed.
remarks_file = f'{file_dir}/pub_remarks_orig.csv'


In [5]:
# Get the remarks data in the form of a DataFrame
pub_remarks_df = extract_remarks(remarks_file)

pub_remarks_df.columns


Remarks dataset: ../../data/pub_remarks_orig.csv


Index(['MLSNumber', 'Address', 'PublicRemarks'], dtype='object')

In [7]:
# Note how this remark has a link to a virtual tour
# Do we want to do anything special with this data?
pub_remarks_df['PublicRemarks'][0]


"Visit this home virtually: http://www.vht.com/434126322/IDXS - Welcome to this gorgeous home in Sherwood Park II!  This is a turnkey property, just ready for you to move right in.  Boasting one of the largest lots in the community, the exterior landscaping in both front and rear yards give this property wonderful curb appeal, plus a tranquil oasis to spend time relaxing and entertaining al fresco. On those cooler evenings you can move inside to the beautiful 3 season room and still enjoy the view of the beautiful landscaping!  The interior of this great home is equally inviting, there are hardwood floors throughout, with the exception of the family room and hallway where you will find ceramic wood plank tile which blends seamlessly with the hardwood. The oil furnace has been conveniently updated to gas as a back up to the electric heatpump which was replaced in 2015. All new interior doors, neutral paint throughout, upgraded electrical panel and additional insulation in the attic, thi

In [8]:
# Note how this remark has a link to a virtual tour
# Do we want to do anything special with this data?

# We could have a separate table with MLS#, lat/lng and Virtual Tour link
pub_remarks_df['PublicRemarks'][5]


"Visit this home virtually: http://www.vht.com/434062885/IDXS - Adorable 3 bedroom brick bungalow!  When you pull up to 3425 Cranston Avenue, you are greeted by a welcoming screened in porch.  As you step through the front door, you walk into the large living room with hardwood floors throughout.  Down the hall you'll find two bedroom and a full bathroom.  The back of the house has an eat-in kitchen and leads to the rear yard and detached one-car garage.  Upstairs is the oversized third bedroom with storage in the side closets and is currently being used as a sewing room.  The lower level with walk out bilco doors is unfinished with a full bathroom and plenty of space for storage or can be finished for an additional family room, etc.  Upgraded french drain system in 2018 and brand new roof in 2019.  Conveniently located to Prices Corner, Kirkwood Highway, 141, shopping and dining!"

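One way to act on the virtual tour question raised above is to split the link off into its own column, which could later feed a separate table. The sketch below assumes every tour link follows the `Visit this home virtually: <url> - ` prefix seen in the two remarks shown; the names `TOUR_PREFIX`, `split_tour_link`, `TourLink`, and `CleanRemarks` are hypothetical, and the sample remarks are shortened versions of rows 5 and 1.

```python
import re

import pandas as pd

# Assumed pattern: the remark opens with "Visit this home virtually: <url> - ".
TOUR_PREFIX = re.compile(r"^Visit this home virtually:\s*(https?://\S+)\s*-\s*")


def split_tour_link(remark):
    """Return (tour URL or None, remark text without the tour prefix)."""
    text = str(remark)
    match = TOUR_PREFIX.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]


# Shortened sample rows for illustration only.
sample = pd.DataFrame({
    "MLSNumber": ["DENC501458", "DENC518982"],
    "PublicRemarks": [
        "Visit this home virtually: http://www.vht.com/434062885/IDXS - "
        "Adorable 3 bedroom brick bungalow!",
        "3 bedroom, 1.5 bath townhome",
    ],
})

pairs = [split_tour_link(r) for r in sample["PublicRemarks"]]
sample["TourLink"] = [url for url, _ in pairs]
sample["CleanRemarks"] = [text for _, text in pairs]

# Listings that have a tour link, ready to be joined with lat/lng later.
tour_links = sample.loc[sample["TourLink"].notna(), ["MLSNumber", "TourLink"]]
```

Stripping the prefix before word extraction would also keep URL fragments such as `vht` or `IDXS` out of the word counts.
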
In [9]:
pub_remarks_df.head(20)


Unnamed: 0,MLSNumber,Address,PublicRemarks
0,DENC518086,2615 Pecksniff Rd,Visit this home virtually: http://www.vht.com/...
1,DENC518982,4938 S Tupelo Turn,"3 bedroom, 1.5 bath townhome located in the he..."
2,DENC512992,15 Kristina Ct,"Location, Location, Location! This Woodmill to..."
3,DENC512104,3251 Champions Dr,"Move right into this 2 bedroom, 2.1 bath townh..."
4,DENC503480,3706 Lafayette St,This nicely maintained home is being sold to s...
5,DENC501458,3425 Cranston Ave,Visit this home virtually: http://www.vht.com/...
6,DENC495076,2409 E Parris Dr,3 bedroom - 1.1 bath brick ranch being sold in...
7,DENC506356,1802 Fenpor Ave,Waiting on signatures. WOW! This brick front ...
8,DENC524092,4309 Birch Cir,Welcome to 4309 Birch Cir in Birch Pointe. Thi...
9,DENC523094,5822 Pepper Ridge Ct,"An unbelievable, unique rear view from this on..."
