Horse Finishing Times #6
Comments
You're 100% right, the times will be meaningless in some cases, with horses eased down etc. Final times alone don't really mean much in European racing without the sectionals, and even less so over jumps. I'm a flat racing man, so the jumps probably didn't enter my mind when I added the time calculation! I think you just have to use your judgement as to when they are useful and trustworthy on an individual basis. Not sure that I can do anything other than just record them. Appreciate the feedback and ideas.
Ta for the character encoding fix in the other thread. And yup re "just record them". Sectionals: a growing area in the UK. We live in hope, if not expectation :) Other ideas: ponder some future means to download by date range. A download by date range feature could perhaps form the backbone. Perhaps even useful if future scrape debugging was needed for a weird page. Just a few brainstorm ideas.
PS: line 146, int(year) < 2019. To permit 2019 data download, is it as simple as changing that to 2020, or does anything else need done?
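For illustration only, a check like that could take its upper limit from the current date rather than a hard-coded year, so it would not need editing each season. This is a minimal sketch, not the script's actual code: is_valid_year and its earliest parameter are hypothetical names, and it assumes the year arrives as a string.

```python
from datetime import date
from typing import Optional


def is_valid_year(year: str, earliest: int, today: Optional[date] = None) -> bool:
    """Return True if `year` is numeric and lies between `earliest` and the current year.

    `earliest` is a hypothetical lower bound supplied by the caller.
    `today` can be passed in to make the check testable.
    """
    if not year.isdigit():
        return False
    current_year = (today or date.today()).year
    return earliest <= int(year) <= current_year
```

With this shape, 2019 becomes valid as soon as the calendar reaches 2019, with no edit to the comparison.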
Yeah, we are lagging behind the rest of the world when it comes to timing, probably by design; the bookies have more influence here than elsewhere. Scraping by date range would require a different method, both in getting the individual race URLs and in storing the data. The current method gets every race URL from a given year at a given track with one request. To scrape by date range, every date would need to be scraped individually, which is easy enough, but I opted for the more efficient method when making the tool as it was more about historical data. I will get the scrape by date working as a separate script and see where it goes, as I think it's a good idea. I'm currently working on a different project and I haven't thought about this in a while, but you have given me some motivation for it. And yes, just increasing that will increase the valid year range. I have updated that to 2020.
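The per-date approach mentioned above would start from a list of every date in the requested range, with one request per date. A rough sketch of just that first step, assuming an inclusive range; dates_in_range is a hypothetical helper, and how each date maps to a results page is deliberately left out:

```python
from datetime import date, timedelta
from typing import Iterator


def dates_in_range(start: date, end: date) -> Iterator[date]:
    """Yield every date from `start` to `end`, inclusive."""
    if end < start:
        raise ValueError("end date is before start date")
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


if __name__ == "__main__":
    # Each date would then be scraped individually.
    for day in dates_in_range(date(2019, 1, 30), date(2019, 2, 2)):
        print(day.isoformat())
```

The trade-off is the one described above: one request per date instead of one request per track and year.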
Bookies with more influence... yup. They may face a degree of suffering as well from shifty bookie accountancy tricks, which are easier to do on net profit than turnover. Racing as a loss leader etc. Water under the bridge, that one, though. As for your motivation levels... I suspect they will rise as your favored flat season approaches :) What you have got so far here is brilliant as is.
This is not an issue at all, just an added comment.
I noted the interesting idea you included about calculating
each horse's finishing time based on the available winner time
and a calculation that transforms lengths beaten into
seconds etc.
This could be the basis of something useful.
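As a rough illustration of that kind of calculation (not the scraper's actual code), the sketch below adds each horse's cumulative beaten distance, converted to seconds, to the winner's time. It assumes each margin is measured in lengths to the horse directly in front, and seconds_per_length = 0.2 is only a placeholder; a realistic value would depend on the race type and the going.

```python
from itertools import accumulate
from typing import List


def finishing_times(
    winner_time: float,
    margins: List[float],
    seconds_per_length: float = 0.2,  # placeholder value, an assumption
) -> List[float]:
    """Estimate a time in seconds for every finisher.

    margins[0] is the distance in lengths between 2nd and 1st,
    margins[1] between 3rd and 2nd, and so on.
    """
    # Running total of lengths behind the winner for each beaten horse.
    lengths_behind_winner = accumulate(margins)
    beaten = [winner_time + total * seconds_per_length for total in lengths_behind_winner]
    return [winner_time] + beaten


if __name__ == "__main__":
    print(finishing_times(60.0, [1.0, 2.5]))
```

The output is only as trustworthy as the margins themselves, which is the point about eased-down finishers.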
The trick for the punter may be when to ponder using it and when to ignore it.
Picture a long distance chase for example.
The last dregs of finishers won't be putting maximum effort into
finishing as fast as they can. Any time recorded
might be a dubious measure of the ability they may demonstrate in future.
Less dubious in the same race may be the times for the first three home.
Style of race may have impact as well.
A 5f flat sprint for example would have a lot less of
the "I will just plod along at the end with minimal effort" style of impact.
Interesting that you have bothered to include such stuff. :)
Future research into the data may reveal when it is a decent metric to use and when it is not.
AND
I can envision two possible routes for this scraper stuff.
#1 - rpscraper.py: you continue to add new stuff to it, such as the above.
#2 - rpscraper.py is more so focussed on pure scrape. There is only so much data on the page and once it grabs it all without fail it is deemed perfect and set in stone.
Datatransform.py is then a 2nd script that takes the raw scrape output and creates extra fields.
Decimal odds, weights in lbs, time calcs, anything else custom calculation-wise.
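A couple of those transforms might look something like this in a separate script. This is only a sketch: the function names are made up, and it assumes the raw scrape stores odds as plain fractions like 11/4 and weights as stones-pounds like 9-7, which may not match the real output (text odds such as "Evens" are not handled).

```python
from fractions import Fraction

LBS_PER_STONE = 14


def fractional_to_decimal(odds: str) -> float:
    """Convert fractional odds such as '11/4' to decimal odds (stake included)."""
    numerator, denominator = odds.split("/")
    return float(Fraction(int(numerator), int(denominator)) + 1)


def stones_to_lbs(weight: str) -> int:
    """Convert a weight written as 'stones-pounds', e.g. '9-7', to pounds."""
    stones, pounds = weight.split("-")
    return int(stones) * LBS_PER_STONE + int(pounds)


if __name__ == "__main__":
    print(fractional_to_decimal("11/4"))
    print(stones_to_lbs("9-7"))
```

Keeping these in a second script means the raw scrape output never changes when a calculation is added or corrected.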
No right or wrong answers I guess.