[WhoScored] Date format problem #126

Closed
CBatatinha opened this issue Dec 19, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@CBatatinha

Hello,

I'm trying to pull the schedule from any league, but I keep getting an error about the date format. Even when I pass a match ID, the data cannot be read because of the date format. How can I solve this?
ValueError: time data 'Jumatatu, Des 26 2022 12:30' does not match format '%A, %b %d %Y %H:%M'

@probberechts
Owner

"Jumatatu" is apparently Swahili for "Monday". Swahili isn't even a supported language on WhoScored, so it is probably a bug in the website (which will resolve itself automatically) or a plugin in your browser which translates the dates automatically.

Which league / season are you trying to scrape?

@CBatatinha
Author

Premier League 2022

@probberechts changed the title from "[WhoScored] Date Format problem" to "[WhoScored] Date format problem" on Jan 8, 2023
@LuccaStochiero

LuccaStochiero commented Feb 13, 2023

Hello,

I've got the same problem as well. When I try to pull any match from any league, the page is automatically translated to Swahili and the date format doesn't match. I even turned on my VPN to see if the problem is specific to Brazil, but nothing changed.

@probberechts
Owner

I have no issues on the main domain, but experience the same problem on the 1xbet subdomain. For example on https://1xbet.whoscored.com/Regions/252/Tournaments/2/England-Premier-League. It seems that WhoScored uses Swahili as the default locale, but I haven't managed to figure out how to force WhoScored to set the English locale.

One workaround I see is to create a fallback function that attempts to parse dates as Swahili if parsing them as English fails. One thing to keep in mind here is that most people will not have the Swahili ("sw_KE") locale on their system, so I think it is best to just create a dict that maps the Swahili days of the week and months to their English equivalents. If someone would like to implement this, please go ahead.
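
A minimal sketch of such a dict-based fallback could look like the following. It is not the fix that was eventually committed, only an illustration of the idea. The function name `parse_date` and the `SW_TO_EN` table are hypothetical, and apart from "Jumatatu" and "Des" (taken from the error message above) the Swahili names are assumed from the common abbreviations rather than verified against WhoScored's output.

```python
import re
from datetime import datetime

# Format taken from the error message in this issue.
DATE_FORMAT = "%A, %b %d %Y %H:%M"

# Assumed Swahili -> English names. Only "Jumatatu" and "Des" are
# confirmed by the error message; the rest are assumptions.
SW_TO_EN = {
    # Days of the week
    "Jumapili": "Sunday",
    "Jumatatu": "Monday",
    "Jumanne": "Tuesday",
    "Jumatano": "Wednesday",
    "Alhamisi": "Thursday",
    "Ijumaa": "Friday",
    "Jumamosi": "Saturday",
    # Abbreviated months that differ from English
    "Mac": "Mar",
    "Mei": "May",
    "Ago": "Aug",
    "Okt": "Oct",
    "Des": "Dec",
}


def parse_date(text: str) -> datetime:
    """Parse an English date, falling back to a Swahili -> English mapping."""
    try:
        return datetime.strptime(text, DATE_FORMAT)
    except ValueError:
        pass
    # Replace every alphabetic token that has a known English equivalent,
    # then retry. A ValueError is still raised if the date stays unparseable.
    translated = re.sub(
        r"[A-Za-z]+",
        lambda m: SW_TO_EN.get(m.group(), m.group()),
        text,
    )
    return datetime.strptime(translated, DATE_FORMAT)


print(parse_date("Jumatatu, Des 26 2022 12:30"))
```

Because the mapping is a plain dict, this does not depend on the "sw_KE" locale being installed. It does assume that `strptime` is running with English day and month names, which is the default unless the program changes the locale.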

@LuccaStochiero

Sorry to bother you again, but I'm really a newbie in Python and more accustomed to R. Do you know where I can find a tutorial on how to make that dict?

@probberechts
Owner

I'll see if I can implement this during the weekend. Currently not sure how to do it best either. I do not have experience with parsing non-English dates.

@guilherme-95

One possible workaround is routing traffic through a country in which 1xbet is not allowed to operate, as that will keep you within the main domain

@probberechts
Owner

probberechts commented Mar 1, 2023

> One possible workaround is routing traffic through a country in which 1xbet is not allowed to operate, as that will keep you within the main domain

Ah, interesting. Such as Belgium apparently 😃 I can browse directly to 1xbet.whoscored.com, but did not know that it gets redirected in other countries.

Anyway, I think the fix that I implemented in a3bf31b is more straightforward. I only re-opened this issue because it looks like I made a small mistake (see ML-KULeuven/socceraction#474).

4 participants