We intend to predict solar energy based on data from six cities: Ahmedabad, Bhopal, Chennai, Delhi, Guwahati and Mumbai.
We recommend using Python 3.5 or above. The PyPI dependencies are noted in requirements.txt.
In addition, to run the stationarity test in stats.py, statsmodels requires numpy+mkl, for which the wheel can be downloaded here. This package replaces numpy, so uninstall any existing version of numpy before installing numpy+mkl.
The requirements bottleneck and numexpr are optional; pandas recommends them for calculation speedups.
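Because both packages are optional, it can help to check whether they are present before running the scripts. The following is a minimal sketch using only the standard library; the helper name find_optional_packages is hypothetical and not part of this repository.

```python
import importlib.util

# Optional speedup packages named in this README.
OPTIONAL_PACKAGES = ("bottleneck", "numexpr")


def find_optional_packages(names=OPTIONAL_PACKAGES):
    """Return {package_name: bool} telling whether each package is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, installed in find_optional_packages().items():
        status = "installed" if installed else "missing (optional)"
        print("{}: {}".format(name, status))
```

The scripts should still run without either package; pandas simply falls back to its slower code paths.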
We used the NREL database, which can be accessed here. To download, select 'Point' under the 'Download Data' section and place a marker on the required city on the map. We used data for all features from 2000 to 2014.
In config.py, you can specify the paths to the data for each city. Our results for the pre-processing methods are recorded in the output folder.
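As an illustration of how per-city paths might be organised, here is a minimal sketch. It is not the actual content of config.py: the names RAW_DATA_DIR, CITY_DATA_PATHS and data_path, and the one-CSV-per-city file layout, are all assumptions made for this example.

```python
import os

# Assumed layout: one CSV per city under raw_data/ (hypothetical file names).
RAW_DATA_DIR = "raw_data"
OUTPUT_DIR = "output"

CITIES = ["Ahmedabad", "Bhopal", "Chennai", "Delhi", "Guwahati", "Mumbai"]

# Map each city to the assumed location of its raw file.
CITY_DATA_PATHS = {
    city: os.path.join(RAW_DATA_DIR, city.lower() + ".csv") for city in CITIES
}


def data_path(city):
    """Look up the raw-data path for a city, failing loudly on unknown names."""
    try:
        return CITY_DATA_PATHS[city]
    except KeyError:
        raise ValueError("Unknown city: {!r}".format(city))
```

Keeping the paths in one dictionary means a new city only needs one extra entry, and a misspelled city name raises an error instead of silently reading nothing.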
Run python data.py to generate the test, validation and train datasets from raw_data.
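For time-ordered data such as this, one common way to build the three datasets is to split by year, so that the model is always evaluated on years later than those it was trained on. The sketch below illustrates that idea only; it is not necessarily what data.py does. The "Year" column name and the 2013/2014 boundaries are assumptions.

```python
def split_by_year(rows, val_year=2013, test_year=2014):
    """Split rows (dicts with a 'Year' key) into train, validation and test lists.

    Years before val_year go to train, years from val_year up to (but not
    including) test_year go to validation, and everything else goes to test.
    """
    train, val, test = [], [], []
    for row in rows:
        year = int(row["Year"])  # assumed column name
        if year < val_year:
            train.append(row)
        elif year < test_year:
            val.append(row)
        else:
            test.append(row)
    return train, val, test
```

Rows read with csv.DictReader can be passed in directly, since the year is converted to an integer inside the function.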
Run python linear_models.py to get baseline results from the linear regression and SVR models for DNI/DHI/GHI prediction.
Run python random_forest_model.py to get Random-Forest results for DNI/DHI/GHI prediction.
Run python gradient_boosting_model.py to get GBM results for DNI/DHI/GHI prediction.
Run python solar_city.py (with appropriate arguments in the main function) to get baseline/GNB/LogisticRegression results for DNI/DHI/GHI prediction.
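The regression scripts above are not reproduced here. As a rough illustration of the kind of comparison they make, the sketch below fits a linear baseline, a random forest and a gradient boosting model with scikit-learn and reports the mean absolute error of each on held-out rows. The data is synthetic, and the function name compare_regressors, the feature construction and the hyperparameters are assumptions for this example, not the repository's settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error


def compare_regressors(seed=0):
    """Fit three regressors on synthetic data; return {model_name: test MAE}."""
    rng = np.random.RandomState(seed)
    # Synthetic stand-in for weather features and an irradiance-like target.
    X = rng.uniform(0.0, 1.0, size=(400, 3))
    y = 800.0 * X[:, 0] * (1.0 - 0.5 * X[:, 1]) + rng.normal(0.0, 20.0, size=400)

    # Hold out the last 100 rows for evaluation.
    X_train, X_test = X[:300], X[300:]
    y_train, y_test = y[:300], y[300:]

    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, mae in compare_regressors().items():
        print("{}: MAE = {:.2f}".format(name, mae))
```

With the real data, X and y would come from the train and test datasets produced by data.py, with one target each for DNI, DHI and GHI.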