updating english data to 2018 and cleaning #52
Merged
Mostly thought I'd make a pull request to see how active the package is.
Updates the English data up to the end of last season and does some basic munging of the League Cup data.
Though it's able to merge, I think there's some work I'd want to do first (most obviously, not having the data in /rob_data). I was mainly thinking of standardising columns across the datasets (see pen/pens/hp/vp between the FA Cup and League Cup) and then updating the biggest European leagues, but was wondering if there's anything pressing I might have missed. I'm also keen to get insights into which data sources are good to use. I pretty much only used 11v11.com, which is OK for the range needed but limited.
I've left in my munging scripts. They're written in very basic, bad-form R, but I think that makes them more readable. I'm doing this mostly while experiments run, so I haven't gone back and commented them etc.
I contemplated properly adding functions for scraping data, but I also think it's a bit dangerous and would rather just archive the static data. Though see the questions above about data sources, in case there are any with APIs etc. that could be utilised.
Best,