This is a guide to other repositories and resources. Most of the code was created to explore ideas and was not prepared as a final product ready for publication. As time allows, I turn selected pieces of code into publication form: Jupyter notebooks, Markdown files, or HTML/PDF.
- Predicting housing prices (using data for Seattle, US)
- NEW Notebook containing the report - exploration, predictive models (e.g. random forest/sklearn, xgboost), and a short introduction to statsmodels
- Lending Club Dataset - predicting loan default
- NEW Notebook investigating the default of loans - exploration and predictive models
- (soon) Notebook containing an XGBoost predictive model
- Repo with Python
- Leaf Classification Competition
- Titanic Competition
- Repo with R - mostly exploration of data, but also a model
- Repo in Python
- My Kaggle profile
- Python packages study
- Repo - contains solutions to exercises in numpy, pandas and a course in computational statistics (using scipy)
- Python extracts (example code for future reference)
- Introduction to Statistical Learning (a book)
- Repo - contains R code redoing examples, figures and lab sections from the book, solutions to exercises
- Analyses and solutions to exercises for data science books by Brian Caffo / JHU
- Advanced Statistical Methods - a course by IPI PAN (Institute of Computer Science, Polish Academy of Sciences)
- Repo - solutions to exercises for modules 1-4
- R extracts (example code for future reference)
- Repo of visualisation tools - mostly basic graphics and ggplot2
- Some projects completed within the Data Science specialisation courses by Johns Hopkins University at Coursera