updating english data to 2018 and cleaning #52

Merged: 3 commits into jalapic:master on Feb 20, 2019

RobWHickman
Contributor

Mostly thought I'd make a pull request to see how active the package is.

Updates the English data up to the end of last season and does some basic munging of the League Cup data.

Though it's able to merge, I think there's some work I'd want to do first (most obviously, not having the data in /rob_data). I was mainly thinking of standardising columns across the datasets (see pen/pens/hp/vp between the FA Cup and League Cup) and then also updating the biggest European leagues, but I was wondering if there's anything pressing I might have missed. I'm also keen to get insights into which data sources are good to use. I pretty much only used 11v11.com, which is OK for the range needed but limited.
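To make the column standardisation idea concrete, here is a minimal sketch in Python rather than the R used in the munging scripts. Everything in it is an assumption for illustration: the `COLUMN_ALIASES` table, the choice of `pens` as the canonical name, the helper names, and the placeholder rows are all made up, and none of it reflects the package's actual column layout.

```python
# Hypothetical alias table: which spelling is canonical is an assumption,
# not something taken from the package.
COLUMN_ALIASES = {"pen": "pens"}


def standardise_columns(rows, aliases=COLUMN_ALIASES):
    """Return copies of the rows with aliased column names renamed."""
    standardised = []
    for row in rows:
        renamed = {}
        for name, value in row.items():
            canonical = aliases.get(name, name)
            if canonical in renamed:
                # Two source columns would map to the same name: refuse to guess.
                raise ValueError(f"columns collide on {canonical!r}")
            renamed[canonical] = value
        standardised.append(renamed)
    return standardised


def column_mismatches(rows_a, rows_b):
    """List column names that appear in only one of the two datasets."""
    cols_a = set().union(*(row.keys() for row in rows_a))
    cols_b = set().union(*(row.keys() for row in rows_b))
    return sorted(cols_a ^ cols_b)


# Placeholder rows, invented purely to exercise the helpers.
fa_cup = [{"home": "Team A", "visitor": "Team B", "pens": None}]
league_cup = [{"home": "Team C", "visitor": "Team D", "pen": None}]

print(column_mismatches(fa_cup, league_cup))
print(column_mismatches(fa_cup, standardise_columns(league_cup)))
```

Running a check like `column_mismatches` before and after renaming would be one way to confirm that two competitions end up with the same set of columns.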

I've left in my munging scripts. They're written in very basic, bad-form R, but I think that makes them more readable. I did this mostly as experimental runs, so I haven't gone back and commented them etc.

I contemplated properly adding functions for scraping data, but I think it's a bit dangerous and would rather just archive the static data. That said, see my questions above about data sources, in case there are any with APIs etc. that could be utilised.

Best,

Owner

jalapic commented Feb 20, 2019

Thanks, Rob, for all of this. I've merged it. Any future work would also be greatly appreciated! I don't have as much time as I'd like to keep this up to date.

jalapic merged commit 93d7166 into jalapic:master on Feb 20, 2019
Owner

jalapic commented Feb 20, 2019

Also, w.r.t. data sources: most sites right now seem to draw on the same sources, so things like 11v11 are fine. There were more problems when I was putting together the historical data, but it should mostly be fine now. The main issue to keep track of is keeping team names consistent across seasons and competitions.
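As a rough illustration of the team-name point, a check along these lines could flag names that do not match an agreed spelling before new data is merged. This is only a sketch in Python under stated assumptions: the `TEAM_ALIASES` table, both helper names, and the set of known teams are hypothetical and are not part of the package.

```python
# Hypothetical alias table mapping variant spellings to one agreed name.
TEAM_ALIASES = {
    "Man Utd": "Manchester United",
    "Manchester Utd": "Manchester United",
}


def normalise_team(name, aliases=TEAM_ALIASES):
    """Collapse stray whitespace, then map known variants to the agreed name."""
    cleaned = " ".join(name.split())
    return aliases.get(cleaned, cleaned)


def unknown_teams(names, known_teams, aliases=TEAM_ALIASES):
    """Return normalised names that are still missing from the known set."""
    normalised = {normalise_team(name, aliases) for name in names}
    return sorted(normalised - set(known_teams))


# Assumed list of names already used elsewhere in the data.
known = {"Manchester United", "Arsenal"}
incoming = ["  Man  Utd ", "Arsenal", "Arsenal FC"]

print(unknown_teams(incoming, known))
```

Anything the check returns would need either a new alias entry or a manual decision, which keeps new seasons from quietly introducing a second spelling of an existing club.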

Owner

jalapic commented Feb 20, 2019

Thanks for this. Ultimately I will have to delete the 'munge' and 'robdata' folders, as I will need to get the package into a format that is CRAN-acceptable. However, for now it works OK, especially if people want up-to-date data.

Contributor Author

RobWHickman commented Feb 22, 2019 via email
