### 1. Input data

The data must be exported from Windographer.

The data must already have gone through quality control (QC), long-term (LT) adjustment and gap filling.

To avoid errors, the column labels in Windographer must follow the naming convention adopted by SolarGIS: GHI, DNI, DIF, flagR, SE, SA, TEMP, AP, RH, WS, WD and PWAT.

In Windographer, the "Type" of every column must be "Other" and the "Units" of every column must be empty.

Finally, export the data as a text file with these settings:

- Time step: "Hourly"
- Date format: YYYYMMDD
- Time format: HH:MM
- Time stamps indicate: try the different options until the time stamp falls in the middle of the hour (e.g. 00:30)

Name the file following the convention "YYYYMMDD-SiteName-DataSource-HourlyLTSeries.txt" and upload it to the folder where this notebook is located.
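
The requirements above can be checked before running the rest of the notebook. The following is only a minimal sketch: the helper names (`check_file_name`, `missing_labels`, `all_mid_hour`) and the regular expression are assumptions made for illustration, not part of Windographer or SolarGIS tooling.

```python
import re
from datetime import datetime

# Labels listed in the text above (SolarGIS naming convention).
EXPECTED_LABELS = ["GHI", "DNI", "DIF", "flagR", "SE", "SA",
                   "TEMP", "AP", "RH", "WS", "WD", "PWAT"]

# Assumed pattern for "YYYYMMDD-SiteName-DataSource-HourlyLTSeries.txt".
FILE_NAME_PATTERN = re.compile(r"^\d{8}-[^-]+-[^-]+-HourlyLTSeries\.txt$")


def check_file_name(file_name):
    """Return True if file_name follows the naming convention."""
    return FILE_NAME_PATTERN.match(file_name) is not None


def missing_labels(columns):
    """Return the expected labels that are absent from the column names."""
    present = {str(c).strip() for c in columns}
    return [label for label in EXPECTED_LABELS if label not in present]


def all_mid_hour(timestamps):
    """Return True if every timestamp falls on minute 30 of its hour."""
    return all(ts.minute == 30 for ts in timestamps)
```

Once the file has been read with pandas, `missing_labels(windogFile.columns)` should return an empty list, and `all_mid_hour(pd.to_datetime(windogFile['Date/Time']))` should be `True` if the export settings were applied as described.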


    

### 2. Import data

The long-term hourly time series is now imported into this notebook.

In [1]:
import pandas as pd
import datetime as dt
import csv




In [2]:
# insert site name
siteName='Irene'

# insert the name of the Windographer export file between the quotes
windogFile = pd.read_csv('20201110-Irene-SolarGIS-HourlyLTSeries.txt', skiprows=12, sep='\t')

#delete Unnamed column
del windogFile['Unnamed: 13']

# add year and month columns
windogFile['Year']= pd.DatetimeIndex(windogFile['Date/Time']).year
windogFile['Month']= pd.DatetimeIndex(windogFile['Date/Time']).month

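# function to identify leap years (Gregorian rule: divisible by 4,
# except century years that are not divisible by 400)

def leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)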

#add leap Year tag column
windogFile['Leap']= windogFile['Year'].apply(leap_year)

#windogFile.head(15)

### Obtain GHI monthly sums and check for full months

In [3]:
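# numeric_only=True skips the text 'Date/Time' column, which cannot be summed meaningfully
firstSum=windogFile.groupby(['Year','Month']).sum(numeric_only=True)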
firstSum=firstSum.drop(columns='Leap')
firstSum=firstSum.reset_index(level=[0,1])

#count the hours in a month
firstCount=windogFile.groupby(['Year','Month']).count()
firstCount=firstCount[firstCount.columns[:1]]
firstCount.columns=['Total hours']
#revert multi-Index to single index Data Frame
firstCount=firstCount.reset_index(level=[0,1])

# number of hours in each month of a non-leap year
# (a leap-year February has 696 hours, so it is never flagged as a full month)

nonLeapYear = pd.DataFrame({'Month':[1,2,3,4,5,6,7,8,9,10,11,12],
                           'Total hours':[744,672,744,720,744,720,744,744,720,744,720,744]})


# identify full months: the hour count must match the non-leap-year table
def full_month(month, hours):
    return hours == nonLeapYear.loc[[month-1],'Total hours'].iloc[0]
    
firstCount['Full month']= firstCount.apply(lambda x: full_month(x['Month'], x['Total hours']), axis=1)

firstSum['Total hours']=firstCount['Total hours']

firstSum['Full month']=firstCount['Full month']
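
As an alternative to the fixed non-leap-year table, the expected number of hours can be derived from the calendar for any year. This is a sketch under that assumption, using only the standard library; `expected_hours` and `is_full_month` are hypothetical names and are not used elsewhere in this notebook.

```python
import calendar


def expected_hours(year, month):
    """Number of hours in the given month, accounting for leap years."""
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month * 24


def is_full_month(year, month, hours):
    """Return True if the hour count covers the whole month."""
    return hours == expected_hours(year, month)
```

With this variant a complete leap-year February (696 hours) would count as full. Because the notebook later remaps every timestamp to 1990, the rows for 29 February would need to be dropped first if such a month were selected.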

### Obtain the GHI Mean of monthly means

In [4]:
noGapData=firstSum[firstSum['Full month']==True]
#MOMM GHI
firstAve=noGapData.groupby(['Month']).mean()
firstAve=firstAve.reset_index()
firstAve=firstAve[['Month','GHI']]
#MOMM DNI
secondAve=noGapData.groupby(['Month']).mean()
secondAve=secondAve.reset_index()
secondAve=secondAve[['Month','DNI']]

### Add mean GHI bias column

In [5]:
#GHI monthly bias
# .copy() makes an independent DataFrame and avoids pandas' SettingWithCopyWarning
biasTable=noGapData[['Year','Month','GHI']].copy()
biasTable['Bias']=abs(1-biasTable['GHI']/biasTable['Month'].map(firstAve.set_index('Month')['GHI']))
tmyYears=biasTable.sort_values('Bias').drop_duplicates('Month')
tmyYears=tmyYears.sort_values('Month')
tmyYears=tmyYears[['Month','Year']]



### Add mean DNI bias column

In [6]:
#DNI monthly bias
# .copy() makes an independent DataFrame and avoids pandas' SettingWithCopyWarning
biasTable2=noGapData[['Year','Month','DNI']].copy()
biasTable2['Bias']=abs(1-biasTable2['DNI']/biasTable2['Month'].map(secondAve.set_index('Month')['DNI']))
#biasTable2.head(5)
tmyYears2=biasTable2.sort_values('Bias').drop_duplicates('Month')
tmyYears2=tmyYears2.sort_values('Month')
tmyYears2=tmyYears2[['Month','Year']]



### Add combined bias column

The two weights in the cell below (0.23 for the GHI bias and 0.7 for the DNI bias) can be modified to change the relative importance of each bias. Their sum must be <= 1.
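
The weighting rule can also be wrapped in a small function that rejects invalid weights. This is a sketch only: `combined_bias` and its parameter names are hypothetical, and the defaults simply repeat the values used in this notebook.

```python
def combined_bias(ghi_bias, dni_bias, ghi_weight=0.23, dni_weight=0.7):
    """Weighted sum of the GHI and DNI biases.

    Works on plain numbers as well as on pandas Series of equal index.
    """
    if ghi_weight < 0 or dni_weight < 0:
        raise ValueError("Weights must not be negative")
    # Small tolerance so that float rounding does not reject a sum of exactly 1.
    if ghi_weight + dni_weight > 1 + 1e-9:
        raise ValueError("The sum of the weights must be <= 1")
    return ghi_weight * ghi_bias + dni_weight * dni_bias
```

For example, `combined_bias(biasTable['Bias'], biasTable2['Bias'])` would compute the same weighted sum as the cell below, with the added check on the weights.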

In [7]:
# .copy() makes an independent DataFrame and avoids pandas' SettingWithCopyWarning
totalBias=biasTable[['Year','Month']].copy()
totalBias['Total bias']=0.23*biasTable['Bias']+0.7*biasTable2['Bias']
#totalBias

  


### TMY year/month selection

In [8]:
tmyYears3=totalBias.sort_values('Total bias').drop_duplicates('Month')
tmyYears3=tmyYears3.sort_values('Month')
tmyYears3=tmyYears3[['Month','Year']]
tmyYears3=tmyYears3.set_index('Month')
tmyYears3

| Month | Year |
|-------|------|
| 1 | 2014 |
| 2 | 2010 |
| 3 | 2001 |
| 4 | 2005 |
| 5 | 2006 |
| 6 | 2005 |
| 7 | 2001 |
| 8 | 1996 |
| 9 | 1998 |
| 10 | 2002 |
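
The selection logic used above (sort by bias, then keep the first row for each month) may be easier to follow on a toy data set. The numbers below are made up for illustration, and the sketch uses GHI only rather than the combined GHI/DNI bias.

```python
import pandas as pd

# Toy monthly GHI sums (made-up numbers, for illustration only).
monthly = pd.DataFrame({
    "Year":  [2001, 2002, 2003, 2001, 2002, 2003],
    "Month": [1, 1, 1, 2, 2, 2],
    "GHI":   [100.0, 110.0, 120.0, 90.0, 80.0, 82.0],
})

# Mean of monthly means: average each calendar month across all years.
momm = monthly.groupby("Month")["GHI"].mean()

# Relative deviation of each individual month from its long-term mean.
monthly["Bias"] = (1 - monthly["GHI"] / monthly["Month"].map(momm)).abs()

# For each calendar month, keep the year with the smallest bias.
selection = (monthly.sort_values("Bias")
                    .drop_duplicates("Month")
                    .sort_values("Month")
                    .set_index("Month")["Year"])
```

With these toy numbers the selection is 2002 for January and 2003 for February, because those are the years whose monthly sums lie closest to the long-term means of 110 and 84.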


### TMY file generation

In [9]:
# flag the rows whose (month, year) pair was selected for the TMY
# (named is_tmy_month so that it is not overwritten by the tmy DataFrame below)
def is_tmy_month(month, year):
    return year == tmyYears3.loc[[month],'Year'].iloc[0]

windogFile['TMY']= windogFile.apply(lambda x: is_tmy_month(x['Month'], x['Year']), axis=1)

#replace the real year by 1990, a standard non-leap year
#.copy() avoids pandas' SettingWithCopyWarning on the assignments below
tmy=windogFile.loc[windogFile['TMY'] == True].copy()
tmy['Date/Time']= pd.to_datetime(tmy['Date/Time'])
tmy["Date/Time"]=tmy["Date/Time"].map(lambda x: x.replace(year=1990))

tmy=tmy.sort_values(by=['Date/Time'])


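#build the output file name

fileName=siteName+'-TMY.csv'


#write the TMY data to the csv file
#(mode='w' overwrites an earlier export, so re-running this cell does not append a duplicate copy)

tmy.to_csv(fileName, mode='w', sep=';',index=False)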

print('The file %s has been created in the folder that contains this notebook' %(fileName))

The file Irene-TMY.csv has been created in the folder that contains this notebook


