# Solar Panel Dataset
## Company: EasyGreen
### Explanation of Variables in Dataset

totalUsePower: Total electricity used on the given day (usage_date), in kWh.

totalProductPower: Total electricity produced by the solar panels on the given day, in kWh.

totalSelfUsePower: The part of that day's solar production that was consumed in the home, in kWh.

totalBuyPower: Total electricity bought from the grid on the given day, in kWh.
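
To make the definitions above concrete, here is a minimal sketch that derives two ratios from three rows copied from the merged table shown later in this notebook. The column names `self_consumption_share` and `grid_share` are illustrative and not part of the dataset, and the ratio definitions are an assumption about how the variables might be combined, not something the data provider specifies.

```python
import numpy as np
import pandas as pd

# Three daily rows for user 909, copied from the merged table in this notebook.
sample = pd.DataFrame({
    "totalUsePower": [9.11, 45.96, 34.60],
    "totalProductPower": [0.00, 3.08, 29.07],
    "totalSelfUsePower": [0.00, 3.08, 19.71],
    "totalBuyPower": [9.11, 42.67, 15.04],
})

# Share of the day's solar production that was used at home.
# Days with zero production become NaN instead of raising a division error.
production = sample["totalProductPower"].replace(0, np.nan)
sample["self_consumption_share"] = sample["totalSelfUsePower"] / production

# Share of the day's consumption that was bought from the grid.
usage = sample["totalUsePower"].replace(0, np.nan)
sample["grid_share"] = sample["totalBuyPower"] / usage

print(sample.round(2))
```

Replacing zeros with NaN before dividing keeps days without production out of later averages rather than reporting them as 0 or infinity.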

In [5]:
from sqlalchemy import create_engine
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from pyairtable import Api
pd.set_option('display.float_format', lambda x: '%.2f' % x)

# Connection details
username = 'introToML'
password = 'pddbb1c530308955bbd2c5aa59a647e30074fc7b7ebde93f626c481f4de93f7de'
host = 'ec2-63-32-137-56.eu-west-1.compute.amazonaws.com'
port = '5432'
dbname = 'dcsdgo449thll5'

# Connection string
connection_string = f'postgresql+psycopg2://{username}:{password}@{host}:{port}/{dbname}'

# Create the engine
engine = create_engine(connection_string)

# Specify your table name
table_name = 'power_usage'

# Read the table into a DataFrame
df = pd.read_sql_table(table_name, engine, schema='public')
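
The cell above writes the database credentials directly into the notebook. One common alternative is to read them from environment variables and build the same kind of connection string from those. The sketch below assumes hypothetical variable names (`EASYGREEN_DB_USER` and so on) that are not defined anywhere in this project, and it uses only the standard library so it runs without a database.

```python
import os
from urllib.parse import quote_plus


def build_connection_string(env=os.environ):
    """Assemble a postgresql+psycopg2 URL from environment-style settings.

    The EASYGREEN_DB_* names are hypothetical; pick whatever names suit
    your deployment.
    """
    # URL-encode the user and password so characters such as '@' or '/'
    # do not break the URL.
    user = quote_plus(env["EASYGREEN_DB_USER"])
    password = quote_plus(env["EASYGREEN_DB_PASSWORD"])
    host = env["EASYGREEN_DB_HOST"]
    port = env.get("EASYGREEN_DB_PORT", "5432")
    dbname = env["EASYGREEN_DB_NAME"]
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{dbname}"


# Placeholder values for demonstration only; real values would come from os.environ.
example_env = {
    "EASYGREEN_DB_USER": "reader",
    "EASYGREEN_DB_PASSWORD": "p@ss/word",
    "EASYGREEN_DB_HOST": "localhost",
    "EASYGREEN_DB_NAME": "solar",
}
example_url = build_connection_string(example_env)
print(example_url)
```

The resulting string can be passed to `create_engine` in the same way as `connection_string` above.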

In [6]:
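addressData = pd.read_csv('addressData.csv')
# Rename the Danish column headers to English:
# 'Kunde ID' (customer ID) -> user_id, 'Kundens navn' (customer name) -> name,
# 'CPR adresse' (registered address) -> address
addressData.rename(columns={'Kunde ID':'user_id', 'Kundens navn':'name', 'CPR adresse':'address'}, inplace=True)
# Missing customer IDs become 0 so the column can be cast to int
addressData['user_id'] = addressData['user_id'].fillna(0).astype(int)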

In [7]:
def clean_addresses(address):
    # Replace unwanted characters with a space or remove them
    address = address.str.replace(r'[@#*]', '', regex=True)  # Remove @, #, *
    address = address.str.replace(r'[\n\r]', ' ', regex=True)  # Replace newlines with space
    address = address.str.strip()  # Strip leading/trailing whitespace
    return address

# Apply cleaning function
addressData['address'] = clean_addresses(addressData['address'])
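
Replacing each newline with a single space can leave runs of several spaces behind, and the `name` column carries the same kind of trailing newlines. As a possible variant, the sketch below collapses any run of whitespace into one space. The function name `clean_text` and the sample strings are made up for illustration. They are not rows from `addressData.csv`.

```python
import pandas as pd


def clean_text(series):
    """Remove @, # and *, collapse whitespace runs, and strip both ends."""
    cleaned = series.str.replace(r'[@#*]', '', regex=True)
    # \s+ matches spaces, tabs and newlines, so "\n\n" becomes one space.
    cleaned = cleaned.str.replace(r'\s+', ' ', regex=True)
    return cleaned.str.strip()


# Constructed examples of messy input (not taken from the real file).
messy = pd.Series([
    "Hårslevvej 16,\n\nHvidovre 2650 ",
    "#Neptun alle 6*",
    "Prashant Bishnoi\n\n\n",
])
tidy = clean_text(messy)
print(tidy.tolist())
```

The same function could be applied to both `addressData['address']` and `addressData['name']`.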

In [19]:
import requests
import time

def geocode_address(api_key, address):
    """Geocode one address with LocationIQ; return (lat, lon) or (None, None)."""
    base_url = "https://us1.locationiq.com/v1/search.php"
    params = {
        "key": api_key,
        "q": address,
        "format": "json"
    }
    try:
        # The timeout keeps a stalled connection from hanging the whole run
        response = requests.get(base_url, params=params, timeout=10)
    except requests.RequestException as exc:
        print(f"Address: {address} -> Request failed: {exc}")
        return None, None
    finally:
        time.sleep(1)  # Pause after every request, whatever the outcome

    if response.status_code != 200:
        print(f"Address: {address} -> Failed to geocode with status code {response.status_code}.")
        return None, None

    data = response.json()
    if len(data) == 0:
        print(f"Address: {address} -> No results found.")
        return None, None

    # Use the first (best-ranked) match
    lat = data[0]['lat']
    lon = data[0]['lon']
    print(f"Address: {address} -> Latitude: {lat}, Longitude: {lon}")
    return lat, lon

def apply_geocoding(row):
    lat, lon = geocode_address("pk.fdf56d0c2c0bbfc3b4054a3a45c0bd72", row['address'])
    return pd.Series([lat, lon])
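
Because every lookup costs one API request plus a one-second pause, it may be worth geocoding each distinct address only once. The sketch below shows one way to do that. The helper `geocode_unique` is hypothetical, and `fake_geocoder` is a stand-in that returns placeholder coordinates so the example runs without network access. In the notebook, a function wrapping `geocode_address` would take its place.

```python
import pandas as pd


def geocode_unique(addresses, geocode_fn):
    """Call geocode_fn once per distinct address and map results to all rows."""
    cache = {}
    for address in addresses.dropna().unique():
        cache[address] = geocode_fn(address)
    # Rows with a missing address fall back to (None, None).
    lat = addresses.map(lambda a: cache.get(a, (None, None))[0])
    lon = addresses.map(lambda a: cache.get(a, (None, None))[1])
    return pd.DataFrame({"latitude": lat, "longitude": lon}, index=addresses.index)


calls = []


def fake_geocoder(address):
    """Stand-in for a real geocoder; records calls and returns dummy values."""
    calls.append(address)
    return (55.0, 12.0)  # placeholder coordinates, not a real lookup result


addresses = pd.Series([
    "Nicolinevej 21, 4600 Køge",
    "Nicolinevej 21, 4600 Køge",
    None,
])
coords = geocode_unique(addresses, fake_geocoder)
print(coords)
print("geocoder calls:", len(calls))
```

Passing the geocoder in as an argument also makes the logic easy to test without spending API quota.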

In [20]:
# Applying geocoding
addressData[['latitude', 'longitude']] = addressData.apply(apply_geocoding, axis=1)

addressData.head()

Address: Hårslevvej 16, Hvidovre 2650 -> Latitude: 55.62991, Longitude: 12.465855
Address: Neptun alle 6, 2770 Kastrup -> Latitude: 55.62424, Longitude: 12.623936
Address: Huginsvej 9 Solrød Strand, 2680 -> Latitude: 55.535841, Longitude: 12.22071
Address: Strandgaardsparken 158, 5600 -> Latitude: 55.060772, Longitude: 10.383692
Address: Nicolinevej 21, 4600 Køge -> Latitude: 55.405702, Longitude: 12.260524
Address: Rødhalsevej 25 4200 Slagelse -> Latitude: 55.418529, Longitude: 11.202092
Address: Bynkevej 4, 3360 Liseleje -> Latitude: 56.009588, Longitude: 11.967963
Address: Havrevænget 119, 4060 Kirke Såby -> Latitude: 55.648997, Longitude: 11.819396
Address: lodsvænget 8, 4874 Gedser -> Latitude: 54.572157, Longitude: 11.935365
Address: Tisvildevej 12, 3210 Vejby -> Latitude: 56.065978, Longitude: 12.136467
Address: Saturnvej 26 4200, Slagelse -> Latitude: 55.409938, Longitude: 11.396523
Address: Bygbjerg 9, 8530 Hjortshøj -> Latitude: 56.249566, Longitude: 10.263924
Address: vadum 

Unnamed: 0,name,user_id,address,latitude,longitude
0,Prashant Bishnoi\n\n\n,1607,"Hårslevvej 16, Hvidovre 2650",55.62991,12.465855
1,Cemal Koybasi,1821,"Neptun alle 6, 2770 Kastrup",55.62424,12.623936
2,Ali Al-Nawas,1614,"Huginsvej 9 Solrød Strand, 2680",55.535841,12.22071
3,Amir Barhagh,1835,"Strandgaardsparken 158, 5600",55.060772,10.383692
4,Daniel Schantz Clausen,758,"Nicolinevej 21, 4600 Køge",55.405702,12.260524


In [22]:
# Save the geocoded addresses; note that this overwrites the input file read earlier
addressData.to_csv('addressData.csv', index=False)

In [23]:
# merge addressData and df on user_id

dfMerged = df.merge(addressData, on='user_id', how='left')
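
A left merge keeps every usage row, so customers without an address record simply end up with empty address columns. To check how many rows are affected, and to confirm that each `user_id` appears only once on the address side, `merge` accepts the `indicator` and `validate` arguments. The sketch below uses small hand-made frames. User 4242 and its usage value are invented to show an unmatched row.

```python
import pandas as pd

# Toy usage table: two real-looking users plus one invented user (4242).
usage = pd.DataFrame({
    "user_id": [909, 909, 1807, 4242],
    "totalUsePower": [9.11, 52.93, 8.08, 1.00],
})

# Toy address table with one row per user.
addresses = pd.DataFrame({
    "user_id": [909, 1807],
    "address": ["Hjortevænget 3, 3100 Hornbæk", "Søndermarksvej 202 Billund, 7190"],
})

merged = usage.merge(
    addresses,
    on="user_id",
    how="left",
    validate="many_to_one",  # raises MergeError if user_id repeats in addresses
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

unmatched = merged[merged["_merge"] == "left_only"]
print(merged["_merge"].value_counts())
print("users without an address:", unmatched["user_id"].unique().tolist())
```

One thing to watch for in this notebook: missing customer IDs were filled with 0 earlier, so if more than one address row lacks an ID, the `many_to_one` check would fail on the repeated 0 key.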

In [25]:
dfMerged

Unnamed: 0,user_id,usage_date,totalBuyPower,totalProductPower,totalUsePower,totalOnGridPower,night_usage,plantId,dailychargecapacity,dailydischargecapacity,totalSelfUsePower,name,address,latitude,longitude
0,909,2022-11-02,9.11,0.00,9.11,0.00,6.66,134553295,0.00,0.00,0.00,Rikke Liebman,"Hjortevænget 3, 3100 Hornbæk",56.084327,12.456106
1,909,2022-11-03,51.74,0.00,52.93,0.00,41.65,134553295,0.08,1.69,0.00,Rikke Liebman,"Hjortevænget 3, 3100 Hornbæk",56.084327,12.456106
2,909,2022-11-04,42.67,3.08,45.96,0.00,40.04,134553295,2.33,1.84,3.08,Rikke Liebman,"Hjortevænget 3, 3100 Hornbæk",56.084327,12.456106
3,909,2022-11-05,15.04,29.07,34.60,9.36,13.08,134553295,9.08,7.22,19.71,Rikke Liebman,"Hjortevænget 3, 3100 Hornbæk",56.084327,12.456106
4,909,2022-11-06,21.33,3.03,24.52,0.10,10.75,134553295,1.45,1.61,2.93,Rikke Liebman,"Hjortevænget 3, 3100 Hornbæk",56.084327,12.456106
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
394595,1807,2024-04-11,0.10,8.39,8.08,0.05,4.01,149488458,5.71,4.16,8.34,Raji Elayathamby,"Søndermarksvej 202 Billund, 7190",55.728342,9.10749
394596,1811,2024-04-11,22.32,3.43,29.49,0.01,6.50,148621742,0.71,3.48,3.42,Mogens Eggers,"Bredagervej 47, 3 tv 2770 Kastrup",55.635639,12.637176
394597,1816,2024-04-11,30.29,13.02,37.70,5.43,33.71,148621572,5.75,4.18,7.59,Marco Anker,"Firhøjvej 37, 4460 Snertinge",55.71937,11.379678
394598,1818,2024-04-11,11.28,9.42,24.36,0.00,17.64,148956704,6.68,8.83,9.42,Preben Majland,"Toften 19, 6650 Brørup",55.479497,9.005815
