Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Parser for TU-Dortmund and FH-Suedwestfalen #88

Merged
merged 27 commits into from
Dec 11, 2018

Conversation

gentges
Copy link
Contributor

@gentges gentges commented Nov 9, 2018

I wrote a parser for the canteens that the Studierendenwerk Dortmund is working for. The parser can parse all canteens of the TU-Dortmund, FH Dortmund, ISM Dortmund, FH Suedwestfalen and FernUni Hagen.

@gentges
Copy link
Contributor Author

gentges commented Nov 12, 2018

Ok test failed, because my parser needs the python module "requests".
Is it possible to add the module or should I rewrite the parser using another module?

@klemens
Copy link
Collaborator

klemens commented Nov 12, 2018

Just replace requests.get(url).text with urlopen(url).read().decode('utf-8') from urllib.request. For query parameters, you can use urllib.parse.urlencode or just manually write the string in this case. You can also change the parser from html.parser to lxml, which is faster and available by default.

@gentges
Copy link
Contributor Author

gentges commented Nov 15, 2018

Thanks for the advice @klemens !
I now use urllib, but with the lxml parser I got had some problems, so I stayed with the html.parser.
The Travis check fails, because of an issue with certificates during the apt update.

@klemens
Copy link
Collaborator

klemens commented Nov 15, 2018

Thanks! If you want to rerun travis, just git commit --amend the the last commit and git push --force update the branch.

I will take a look at the changes this weekend.

Copy link
Collaborator

@klemens klemens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for writing this parser and sorry for the delay. This looks quite good already! The automatic legend extraction is not working currently, but this easy to fix. Apart from that, there are only some minor things that can be improved.

parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
elif 'price'in item['class']:
price = item.text
if 'student' in item['class']:
student_price = getAndFormatPrice(price)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw in your commits, that you removed the getAndFormatPrice function and then added it back. I just tested it myself and the convertPrice function in feed.py seems to properly extract all prices. Were there some problems with just passing the string directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right I removed the function, but then I realised again why I put it in in the first place. There are some meals (I guess only side dishes), where there is no price given. The standard parsing of the prices breaks at that point, so I implemented my method to handle empty prices.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Seems like we should make convertPrice a little bit more flexible. 😅 Do you remember an example where there were no prices given?

parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
parsers/dortmund.py Outdated Show resolved Hide resolved
@klemens
Copy link
Collaborator

klemens commented Dec 9, 2018

Thanks for the changes. This is almost ready to be merged.

After this has been merged, we have to add the parser and the contained canteens to the openmensa.org site. Do you want do this with your own account (you will get the emails if something breaks) or do you prefer if I add the parser to my account?

@gentges
Copy link
Contributor Author

gentges commented Dec 10, 2018

Sure, I can add the parser myself. There is a tutorial for that on the website, right?

@klemens klemens merged commit cedd712 into mswart:master Dec 11, 2018
@klemens
Copy link
Collaborator

klemens commented Dec 11, 2018

I merged this (sqashed into one commit) into the master branch, which means that it should become available at http://omfeeds.devtation.de/dortmund/*location*/full.xml within a day or so (there was an error uploading the new built just now, I will take a look). After the feed is available, you can add the parser and the canteens to your openmensa.org account. There is no tutorial, but if you have any questions, you can ask me. 😉

When adding the canteens (including address etc.) make sure to use the same name as in the parser (like tu-hauptmensa) for the name of the "Quelle". After you have added all canteens, you can add the index url http://omfeeds.devtation.de/dortmund/index.json to the parser settings and use the "Update" button. This will create all "feeds" automatically, so you don't have to add them manually.

@klemens
Copy link
Collaborator

klemens commented Feb 3, 2019

Unfortunately, the parser update is still broken: #89 However, as this is a new parser, I can offer you to host the parser at my own server for now.

@klemens
Copy link
Collaborator

klemens commented Feb 4, 2019

Sorry for the long delay 😞, but I have good news. I am now hosting the parsers on my own server: https://omfeeds.crpt.de/dortmund/index.json

So feel free to create the parser on openmensa.org. If you have any questions, just ping me.

@gentges
Copy link
Contributor Author

gentges commented Feb 13, 2019

No worries, I've had a lot to do lately so the parser had a not so high priority for me.
But thank you for making it accessible now. I just added it to the website.

@klemens
Copy link
Collaborator

klemens commented Feb 13, 2019

Looks good! The kostBar currently has a meal with an empty name, which is not allowed and leads to the following error:

Traceback (most recent call last):
  File "/data/service/openmensa-parsers/repo/wsgihandler.py", line 23, in handler
    content = parse(request, *(match.group('dirs').split('/') + [file]))
  File "/data/service/openmensa-parsers/repo/parse.py", line 10, in parse
    return parsers[parser_name].parse(request, *args)
  File "/data/service/openmensa-parsers/repo/utils.py", line 49, in parse
    return self.sources[source].parse(request, *args)
  File "/data/service/openmensa-parsers/repo/utils.py", line 212, in parse
    return self.handler(*self.args, today=feed == 'today.xml', **self.kwargs)
  File "/data/service/openmensa-parsers/repo/parsers/dortmund.py", line 67, in parse_url
    parse_day(canteen, soup, wDay)
  File "/data/service/openmensa-parsers/repo/parsers/dortmund.py", line 126, in parse_day
    canteen.addMeal(wdate, category, description, notes=supplies, prices={'student': student_price, 'employ>
  File "/data/service/openmensa-parsers/repo/pyopenmensa/feed.py", line 662, in addMeal
    notes or [], prices)
  File "/data/service/openmensa-parsers/repo/pyopenmensa/feed.py", line 394, in addMeal
    raise ValueError('Meal names must not be empty')
ValueError: Meal names must not be empty

We should just ignore meals with an empty name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants