New Parser for TU-Dortmund and FH-Suedwestfalen #88

gentges · 2018-11-09T11:22:06Z

I wrote a parser for the canteens operated by the Studierendenwerk Dortmund. It can parse all canteens of the TU Dortmund, FH Dortmund, ISM Dortmund, FH Suedwestfalen and FernUni Hagen.

gentges · 2018-11-12T19:27:22Z

OK, the test failed because my parser needs the Python module "requests".
Is it possible to add the module, or should I rewrite the parser using another module?

klemens · 2018-11-12T20:44:11Z

Just replace requests.get(url).text with urlopen(url).read().decode('utf-8') from urllib.request. For query parameters, you can use urllib.parse.urlencode or just manually write the string in this case. You can also change the parser from html.parser to lxml, which is faster and available by default.
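A minimal sketch of the replacement suggested above, using only the standard library. The helper names build_url and fetch_text, the example URL and the timeout value are assumptions made for illustration and are not taken from the parser itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params=None):
    """Append URL-encoded query parameters to base_url (hypothetical helper)."""
    if not params:
        return base_url
    return base_url + '?' + urlencode(params)


def fetch_text(url, timeout=30):
    """Stand-in for requests.get(url).text: download and decode as UTF-8."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode('utf-8')


# Example (placeholder URL, no request is made here):
menu_url = build_url('https://example.org/menu', {'day': '2018-11-12'})
```

Unlike requests, this always decodes as UTF-8 instead of inspecting the response headers, which is fine as long as the site serves UTF-8.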

gentges · 2018-11-15T10:09:47Z

Thanks for the advice, @klemens!
I now use urllib, but I had some problems with the lxml parser, so I stayed with html.parser.
The Travis check fails because of an issue with certificates during the apt update.

klemens · 2018-11-15T19:05:47Z

Thanks! If you want to rerun Travis, just git commit --amend the last commit and git push --force to update the branch.

I will take a look at the changes this weekend.

klemens

Thanks again for writing this parser and sorry for the delay. This looks quite good already! The automatic legend extraction is not working currently, but this is easy to fix. Apart from that, there are only some minor things that can be improved.

klemens · 2018-12-09T14:42:53Z

parsers/dortmund.py

+            elif 'price' in item['class']:
+                price = item.text
+                if 'student' in item['class']:
+                    student_price = getAndFormatPrice(price)


I saw in your commits that you removed the getAndFormatPrice function and then added it back. I just tested it myself, and the convertPrice function in feed.py seems to extract all prices properly. Were there any problems with just passing the string directly?

gentges · 2018-12-10

Yeah, you are right, I removed the function, but then I remembered why I had added it in the first place. There are some meals (I guess only side dishes) where no price is given. The standard price parsing breaks at that point, so I implemented my own method to handle empty prices.

klemens · 2018-12-11

You are right. Seems like we should make convertPrice a little bit more flexible. 😅 Do you remember an example where there were no prices given?
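To illustrate the empty-price case discussed in this thread, here is a rough sketch of a tolerant price helper. It is not the getAndFormatPrice function from the PR, whose body is not shown here. The names parse_price_cents and build_prices, the regex and the sample strings are all assumptions.

```python
import re
from typing import Dict, Optional

# Assumed price format: digits, a comma or dot, then two decimals ("2,50 €").
PRICE_RE = re.compile(r'(\d+)[.,](\d{2})')


def parse_price_cents(text: Optional[str]) -> Optional[int]:
    """Return the price in cents, or None if the text contains no price."""
    match = PRICE_RE.search(text or '')
    if match is None:
        return None
    euros, cents = match.groups()
    return int(euros) * 100 + int(cents)


def build_prices(raw: Dict[str, str]) -> Dict[str, int]:
    """Keep only the roles for which a price could actually be parsed."""
    prices = {}
    for role, text in raw.items():
        cents = parse_price_cents(text)
        if cents is not None:
            prices[role] = cents
    return prices
```

With this approach a side dish without prices simply ends up with an empty prices dictionary instead of raising an error during parsing.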

klemens · 2018-12-09T14:51:24Z

Thanks for the changes. This is almost ready to be merged.

After this has been merged, we have to add the parser and the contained canteens to the openmensa.org site. Do you want to do this with your own account (you will get the emails if something breaks), or do you prefer that I add the parser to my account?

gentges · 2018-12-10T13:19:50Z

Sure, I can add the parser myself. There is a tutorial for that on the website, right?

klemens · 2018-12-11T21:12:32Z

I merged this (squashed into one commit) into the master branch, which means that it should become available at http://omfeeds.devtation.de/dortmund/*location*/full.xml within a day or so (there was an error uploading the new build just now, I will take a look). After the feed is available, you can add the parser and the canteens to your openmensa.org account. There is no tutorial, but if you have any questions, you can ask me. 😉

When adding the canteens (including address etc.) make sure to use the same name as in the parser (like tu-hauptmensa) for the name of the "Quelle". After you have added all canteens, you can add the index url http://omfeeds.devtation.de/dortmund/index.json to the parser settings and use the "Update" button. This will create all "feeds" automatically, so you don't have to add them manually.

klemens · 2019-02-03T17:30:40Z

Unfortunately, the parser update is still broken, see #89. However, as this is a new parser, I can offer to host it on my own server for now.

klemens · 2019-02-04T22:39:15Z

Sorry for the long delay 😞, but I have good news. I am now hosting the parsers on my own server: https://omfeeds.crpt.de/dortmund/index.json

So feel free to create the parser on openmensa.org. If you have any questions, just ping me.

gentges · 2019-02-13T19:02:12Z

No worries, I've had a lot to do lately, so the parser was not a high priority for me.
But thank you for making it accessible now. I just added it to the website.

klemens · 2019-02-13T19:33:57Z

Looks good! The kostBar currently has a meal with an empty name, which is not allowed and leads to the following error:

Traceback (most recent call last):
  File "/data/service/openmensa-parsers/repo/wsgihandler.py", line 23, in handler
    content = parse(request, *(match.group('dirs').split('/') + [file]))
  File "/data/service/openmensa-parsers/repo/parse.py", line 10, in parse
    return parsers[parser_name].parse(request, *args)
  File "/data/service/openmensa-parsers/repo/utils.py", line 49, in parse
    return self.sources[source].parse(request, *args)
  File "/data/service/openmensa-parsers/repo/utils.py", line 212, in parse
    return self.handler(*self.args, today=feed == 'today.xml', **self.kwargs)
  File "/data/service/openmensa-parsers/repo/parsers/dortmund.py", line 67, in parse_url
    parse_day(canteen, soup, wDay)
  File "/data/service/openmensa-parsers/repo/parsers/dortmund.py", line 126, in parse_day
    canteen.addMeal(wdate, category, description, notes=supplies, prices={'student': student_price, 'employ>
  File "/data/service/openmensa-parsers/repo/pyopenmensa/feed.py", line 662, in addMeal
    notes or [], prices)
  File "/data/service/openmensa-parsers/repo/pyopenmensa/feed.py", line 394, in addMeal
    raise ValueError('Meal names must not be empty')
ValueError: Meal names must not be empty

We should just ignore meals with an empty name.
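One way to ignore such meals is sketched below. The wrapper add_meal_if_named is hypothetical, and FakeCanteen is only a stand-in that mimics the empty-name check seen in the traceback. The actual fix in the parser may look different.

```python
class FakeCanteen:
    """Stand-in for the feed builder; it only mimics the empty-name check."""

    def __init__(self):
        self.meals = []

    def addMeal(self, date, category, name, notes=None, prices=None):
        if not name.strip():
            raise ValueError('Meal names must not be empty')
        self.meals.append((date, category, name, notes or [], prices or {}))


def add_meal_if_named(canteen, date, category, name, notes=None, prices=None):
    """Add the meal unless its name is empty; return True if it was added."""
    name = (name or '').strip()
    if not name:
        # Skip entries such as the nameless kostBar meal instead of raising.
        return False
    canteen.addMeal(date, category, name, notes=notes, prices=prices)
    return True
```

Skipping in the parser keeps the rest of the day's menu intact, whereas the unhandled ValueError makes the whole feed request fail.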

gentges added 12 commits November 8, 2018 11:58

added parser for tu-dortmund

a2c19c4

activated parser in dortmund

e9682bb

removed debug log

9de2c56

Added categories for ISM and added a fallback for missing categories

3cf00f5

added categories and canteens

4f1c08a

added fallback for meals without price

60a6b9d

changed name of fh mensa

6047ae4

deleted unneeded line of code and fixed url of mensa-max-ophuels-platz

447bb4a

fixed url of fsw-mensa

3f9f4fe

added category

a3c1e45

added category

4f30bf8

added category

254916c

gentges added 2 commits November 13, 2018 12:31

using urllib instead of requests

5f04fe6

fixed index error

4cb3f8c

changed parser back to html.parser

751271a

gentges force-pushed the master branch from 7e09429 to 751271a Compare November 16, 2018 16:37

klemens requested changes Dec 2, 2018

gentges added 8 commits December 3, 2018 12:14

deleted global on map categories in parse_day

296f103

moved legend parsing up

a745e32

strip keys for legend

d330cbf

replaced if statement in parse_day

26dde8c

removed function to parse prices

acb6c4c

added method define_category

d1409d2

fixed some errors and tested all canteens

b136569

changed regex for prices and return prices in cents

ec7d5b9

klemens requested changes Dec 9, 2018

gentges added 4 commits December 10, 2018 13:15

use setLegendData for setting legend

1dc806b

removed characters in strip

b4a4b83

strip value of legend

4a03427

fixed bracket mismatch

d35862a

klemens merged commit cedd712 into mswart:master Dec 11, 2018

gentges mentioned this pull request Feb 13, 2019

Dortmund: fix error when parsing meals with no name #94

Closed
