-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Parser for most of the Mensas from Studentenwerk Thueringen #79
Conversation
…except those who don't seem to publish online e.g. small cafeterias). Therefore I renamed ilmenau.py to thueringen.py and change the url scheme to .../thueringen/<CityPrefix><MensaShortname>/... This added support for Mensas in: - Eisenach: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/eisenach/index.html - Erfurt: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/erfurt/index.html - Gera: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/gera/index.html - Ilmenau: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/index.html - Jena: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/jena/index.html - Nordhausen: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/nordhausen/index.html - Schmalkalden: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/schmalkalden/index.html - Weimar: http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/weimar/index.html I also added one test url per city
Hey, thanks for your PR!
I'm not fully sure what you mean, and I understand your code enough to verify if you did it correctly. OpenMensa keeps a history of all parsed days. I assume you mean when today's menu (say Wednesday, 2018-05-02) contains entries for the past week or the day before (2018-04-23, or 2018-05-01), then the correct behavior is to just output them as well. So the feed will simply contain all the days from the menu, and OpenMensa will update the "old" days accordingly. I will just quickly go through your code and point out a few general things that you could improve. ;) |
Hi Y0hy0h, what I mean by past days as closed is the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some very simple things you can do to break down your code, make it more readable. Also you make it more robust and debuggable by improving your exception handling.
parsers/thueringen.py
Outdated
closed_date = mensa_start_date + timedelta(days=i-2) | ||
canteen.setDayClosed(closed_date.date()) | ||
except Exception: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dangerous, because you're swallowing all exceptions. Prefer making the try
block as short as possible and only except
ing exactly the errors you are interested in. Or even better, replace it with a clear check (using e. g. if). That way it becomes clear what you are checking, and you do not accidentally swallow other exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworked that part to get rid of the long and nested try
block.
Thanks for your recommendations 👍
parsers/thueringen.py
Outdated
from datetime import datetime, timedelta | ||
from pyopenmensa.feed import LazyBuilder | ||
|
||
calendar_week_regex = re.compile('\d{4}-\d{1,2}$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer declaring variables close to where they are used. Readability trumps performance, since you can always optimize readable code, when you've determined that the performance is important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the variables to the functions they are used at except one which is used at multiple places
parsers/thueringen.py
Outdated
for fee_candidate in fees: | ||
fee_string = fees_regex.findall(fee_candidate.text) | ||
for s in fee_string: | ||
amount_strings = amount_regex.findall(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is deeply nested. Prefer extracting a function and name it after what happens here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted and split the parsing into 3 different functions
parsers/thueringen.py
Outdated
current_date = mensa_start_date + timedelta(days=day_id-2) | ||
meals_data = day.find_all('tr') | ||
for meal_data in meals_data: | ||
meals = meal_data.find_all('td') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: Extract a function. Small functions are awesome, because you can quickly understand them, whereas such a big function is difficult to read.
Anywhere you can reasonably extract a function, do it. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted and split the parsing into 3 different functions
Changed default behaviour of non-available days in the past so that they are ignored instead of treated as closed. Removed try-catch clause
…ed (could happen at some canteens). Added error handling if the start date of a canteens current week could not be parsed
I took care of the changes you requested. |
Yes, that's what I would do. ;) |
This is already changed in the commits that also addressed the changes you requested. |
Sorry, I forgot to say that I'd like to wait for @klemens's (or another maintainer's) approval. I'm quite new to this repo so I have no experience merging new parsers. @klemens hinted that he might take some time to do that in a while. I hope it's ok for you to wait until that happens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for writing this parser!
I just had a little bit of time and so this is only a first round of review. I have not yet looked at the parse_fees
and parse_meals
functions.
There are some problems that need to be fixed and I also left a couple of improvement suggestions. Also python doesn't mind if you put an empty line here and there and it will improve readability quite a lot. 😉
build_scripts/test-urls.txt
Outdated
http://localhost:9090/thueringen/je-zeiss/full.xml | ||
http://localhost:9090/thueringen/nh-mensa/full.xml | ||
http://localhost:9090/thueringen/sk-mensa/full.xml | ||
http://localhost:9090/thueringen/we-horn/full.xml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only testing one should be enough.
parsers/thueringen.py
Outdated
fees = parse_fees(document) | ||
if len(fees) is 2: | ||
employees_fee = fees[0] | ||
guests_fee = fees[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just define the defaults in the parse_fees
function or as "constants" in the global scope. Also len(fees)
is always 2 and you should check using ==
instead of is
(see below).
parsers/thueringen.py
Outdated
def parse_start_date(document): | ||
week_start_end_date_regex = re.compile('\d{1,2}\.\d{1,2}\.\d{4}') | ||
calendar_week_regex = re.compile('\d{4}-\d{1,2}$') | ||
options = document.find_all('option', value=calendar_week_regex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The select box has name="selWeek"
, so you don't need to search the whole document.
parsers/thueringen.py
Outdated
calendar_week_regex = re.compile('\d{4}-\d{1,2}$') | ||
options = document.find_all('option', value=calendar_week_regex) | ||
assert len(options) > 0 | ||
options_filtered = list(filter(lambda x: x.has_attr('selected'), options)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BS can search for multiple attributes at once:
doc.find("option", value=calendar_week_regex, selected=True)
You could also remove the regex if you just generate the correct value for the current week directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, I am fairly new to BS and had my problems with that one ;-)
parsers/thueringen.py
Outdated
try: | ||
mensa_start_date = parse_start_date(document) | ||
except AssertionError: | ||
return canteen.toXMLFeed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this happen on weekends? I would prefer just returning None when the error is expected (like on a weekend) and otherwise just let the exceptions terminate the parser, because otherwise there won't be any notification if the website changes and the parsers fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your feedback above, I changed the parse_start_date function so it should not throw an exception (unless something goes really wrong and as you stated in that case we want the parser to fail).
As a result the parse_date function is also a lot shorter than before :)
parsers/thueringen.py
Outdated
else: | ||
canteen.addMeal(current_date.date(), 'Hauptgerichte', main_dish, notes, costs, | ||
None) | ||
for i in range(min(days_open), 9): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail if days_open
is empty. Also i
should be day_id
like above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified this line into for day_id in range(min(days_open or [9]), 9):
so if days_open is empty the loop will be skipped
parsers/thueringen.py
Outdated
for i in range(min(days_open), 9): | ||
if i not in days_open: | ||
closed_date = mensa_start_date + timedelta(days=i-2) | ||
if not today or closed_date.day is datetime.today().day: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above regarding is
and the date comparison.
parsers/thueringen.py
Outdated
for meal in meals: | ||
main_dish = meal[0] | ||
notes = meal[1] | ||
costs = meal[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can destruct this in one line:
main_dish, notes, costs = meal
parsers/thueringen.py
Outdated
if guests_fee is not -1: | ||
costs['other'] = costs['student'] + guests_fee | ||
if today: | ||
if current_date.day is datetime.today().day: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is
operator compares object identity. While this "works" in this case due to an implementation detail, you should really just do a regular compare: ==
Also you should compare the date objects directly and not the day property.
parsers/thueringen.py
Outdated
None) | ||
else: | ||
canteen.addMeal(current_date.date(), 'Hauptgerichte', main_dish, notes, costs, | ||
None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should check if today and current_date != datetime.today()
before parsing the day and just continue
if it is the case.
- removed additional testing urls - reworked parse_start_date function - global constants for default fees - using != and == instead of is and is not - comparing date objects directly - handling empty days_open list - some general stuff he mentioned
I took care of the changes you requested. |
Hi there, I know you are pretty busy atm, but I just wanted to check if there is any progress regarding the PullRequest :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good already!
The only missing larger issue is the use of readable ingredient notes. It would also be nice to extend the feed to the following two weeks, but we can also implement this later.
The rest are just some minor improvement suggestions.
parsers/thueringen.py
Outdated
fees = parse_fees(document) | ||
|
||
employees_fee = fees[0] | ||
guests_fee = fees[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
employees_fee, guests_fee = parse_fees(document)
parsers/thueringen.py
Outdated
if employees_fee != employees_fee_default: | ||
costs['employee'] = costs['student'] + employees_fee | ||
if guests_fee != guests_fee_default: | ||
costs['other'] = costs['student'] + guests_fee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use None
as the missing values, then you don't need the "constants" and can just check using is not None
here.
parsers/thueringen.py
Outdated
def parse_fees(document): | ||
fees_regex = re.compile('Bedienstete.*') | ||
|
||
fees_p = document.find_all('p', {'class': 'MsoNormal'}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use class_
instead of the dictionary.
Also, you can directly search for the text using the regex:
fees = document.find_all('p', class_='MsoNormal', string=fees_regex)
parsers/thueringen.py
Outdated
continue | ||
|
||
canteen.addMeal(current_date.date(), 'Hauptgerichte', main_dish, notes, costs, | ||
None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coudn't we use Mittag Ausgabe A
, Mittag Ausgabe B
, etc as categories instead of Hauptgerichte
for everything?
parsers/thueringen.py
Outdated
|
||
for meal_data in meals_t_rows: | ||
meal_t_datas = meal_data.find_all('td') | ||
if len(meal_t_datas) is not 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use !=
instead of is not
.
parsers/thueringen.py
Outdated
|
||
ingredients = ingredients_regex.findall(meal_t_datas[1].text) | ||
if len(ingredients) > 0: | ||
notes.append(ingredients[0].strip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add a single note for every ingredient. Also please parse the table at the bottom of the page to write out the abbreviations in full. There are some helpers for this in https://github.com/mswart/pyopenmensa/blob/7b6e7a70ab63530af27b985019f28863f8b8f368/feed.py#L182-L246 that can parse ingredient tables and automatically replace the abbreviations. I can also help you out with this if you want.
parsers/thueringen.py
Outdated
students_fee = float(students_fee_string[0].replace(',', '.')) | ||
costs = {'student': students_fee} | ||
|
||
meals.append((main_dish, notes, costs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just yield
the tuples instead of storing them in a list and then returning the list.
parsers/thueringen.py
Outdated
canteen = LazyBuilder() | ||
|
||
content = urlopen(url).read() | ||
document = parse(content, 'lxml') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are only parsing the current week currently, which means that the full fead does not contain the meals for Monday on Sunday, but last Friday instead, which is not very useful.
To get the next week, you have to send a request with the selWeek=2018-25
parameter, where 25 is the calender week. It seems you can also get a list view using selView=liste
which contains the explicit date, which might be easier to parse that the tabbed view.
The website uses POST-requests, but it seems GET also works: https://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/mensa-ehrenberg.html?selWeek=2018-25&selView=liste
- implemented the syntactic changes klemens requested - categories are now named by Ausgabe A,B,C etc. according to the studentenwerk website - ingredients are parsed according to the table on the bottom of the studentenwerk website, and appended in full as notes (one note for each ingredient / allergen) to the meals - meals are fetched for the current week + 2 as klemens suggested via GET request (selWek=...)
First of all, thanks for your feedback! I haven't worked much with python so far so every bit of feedback that helps me to enhance my knowledge of the language is great 👍 Thanks for your help in advance :) -btw I referenced the wrong issue in my commit message by accident, sorry for that ;) |
I am not 100% sure, but I think that newer data always overwrites older data. Alternatively, you could just return no data for those days and not bother with the "canteen closed" status at all.
Don't worry, we can clean up the commit history before merging! |
Allrighty, days that don't appear on the websites tabs are now being omitted 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the update! We are almost there, just some minor fixes remain. I also added some improvement suggestions, as you seem to like those. 😉
parsers/thueringen.py
Outdated
|
||
def parse_url(url, today=False): | ||
year = datetime.now().isocalendar()[0] | ||
week = datetime.now().isocalendar()[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
year, week, _ = datetime.now().isocalendar()
parsers/thueringen.py
Outdated
|
||
canteen = LazyBuilder() | ||
|
||
for i in range(0, 3): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Zero is the default, so you can just use range(3)
.
parsers/thueringen.py
Outdated
|
||
for i in range(0, 3): | ||
|
||
content = urlopen(url+'?selWeek='+str(year)+'-'+str(week + i)).read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This formats the week number without a leading zero if it is less then 10. Now I have no idea what this number should look like (and I guess we have to wait until the end of the year to find out), but if I were to implement the website, I would use a leading zero. 😉 You can use a format string to format the number (and the other arguments):
"{}?selWeek={}-{:02}".format(…)
parsers/thueringen.py
Outdated
day_id = int(day['id'][-1:]) | ||
|
||
current_date = mensa_start_date + timedelta(days=day_id-2) | ||
digits, chars = parse_ingredients(document) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same for all days, so you only need to parse it once.
parsers/thueringen.py
Outdated
option = document.find("option", value=calendar_week_regex, selected=True) | ||
|
||
if option is None: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you return None here? You don't check for None after calling the function and I think the parser should fail if it can't extract the date.
parsers/thueringen.py
Outdated
if chars.get(ingredient) is not None: | ||
ingredients.append(chars[ingredient][1:]) | ||
|
||
return ingredients |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do this whole function in one line using list comprehensions if you want to:
notes = [legend[note] for note in notes if note in legend]
parsers/thueringen.py
Outdated
if len(category) == 1: | ||
category = category[0] | ||
else: | ||
category = 'Hauptgerichte' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(category) == 1
will always be true, because you just created a list with length 1 above? Did you mean to check if the string was empty?
parsers/thueringen.py
Outdated
ingredients = ingredients_regex.findall(meal_t_datas[1].text) | ||
|
||
if len(ingredients) > 0: | ||
ingredients_list = ingredients[0].strip('Inhalt: ').strip('\r').split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use strip()
instead of strip('\r')
. Also you should strip the resulting list items (maybe combined with the list comprehension from above?), because there may be some spaces in there sometimes.
parsers/thueringen.py
Outdated
|
||
canteen = LazyBuilder() | ||
|
||
for i in range(0, 3): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably only fetch the current week when the today parser is requested.
parsers/thueringen.py
Outdated
costs['employee'] = costs['student'] + employees_fee | ||
if guests_fee is not None: | ||
costs['other'] = costs['student'] + guests_fee | ||
if today and current_date != datetime.today(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
./parse.py thueringen il-ehrenberg today.xml
currently returns an empty result for me, because this comparison between dates is not actually between dates, but between datetimes. It should be current_date != date.today()
and parse_start_date
should return a date instead of a datetime (just use .date()
before returning).
- implemented the proposed syntactic changes - weeks to be fetched now determined from <select> elements options in the document so only weeks that are available from the website are being fetched. As a bonus we don't need to care about formatting with leading zeroes etc. - variables that don't change are only being parsed once - fess should be parsed even if more paragraphs with Bedienstete etc. exist - reworked the ingredients parsing as proposed which ultimately led to the drop of a whole function as this could be done in a one-liner - dates should now be correctly compared without time so the today feed should work again
I implemented the proposed changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, parsing the actual values is much better!
The code looks really good now, there are just some optimizations remaining. 👍
parsers/thueringen.py
Outdated
for s in table.stripped_strings: | ||
if groups_regex.search(s): | ||
split = s.split(':') | ||
groups[split[0]] = split[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The object returned by search
already contains the matched groups (the parenthesis in the regex), which can be accesses using group(1)
and group(2)
, so you don't need split
. You should also strip the description here instead of in the list comprehension in line 84.
parsers/thueringen.py
Outdated
ingredients = ingredients_regex.findall(meal_t_datas[1].text) | ||
|
||
if len(ingredients) > 0: | ||
ingredients_list = ingredients[0].strip('Inhalt: ').strip().split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed this in the last review. strip('Inhalt: ')
probably does not do what you think it does. (I suggest just using [8:]
or a regex)
content = urlopen(url).read() | ||
document = parse(content, 'lxml') | ||
|
||
available_weeks = parse_available_weeks(document) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fetches the current week's site twice. I suggest skipping the current week and using the already available document
for adding the meals. It probably makes sense to split the parse_url
function into two functions, one which fetches the sites and checks which are available and the other does the actual parsing (what's currently in the for loop).
parsers/thueringen.py
Outdated
costs['employee'] = costs['student'] + employees_fee | ||
if guests_fee is not None: | ||
costs['other'] = costs['student'] + guests_fee | ||
if today and current_date != date.today(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still fetches all sites, even when just the first one is needed. You need another check that only fetches the current site for today feeds (see also the comment above).
parsers/thueringen.py
Outdated
|
||
|
||
def parse_available_weeks(document): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: I find empty lines after function definitions of for loops (see below) a little bit confusing. They also seem to confuse the git diff algorithm. 😉
- implemented the proposed syntactic changes - removed blank lines under function definitions - using matched groups instead of string splitting for ingredients parsing - sites should only be fetched when necessary (first page only once, following pages only if not today etc.) - parsing of meals for canteen now separated in new function parse_meals_for_canteen so parse_url only responsible for fetching necessary information (html documents, weeks, groups, fees etc.) and returning the xml feed
I implemented the proposed changes - thanks for your feedback 👍 |
I just found out my solution might not work properly yet. If accessed on weekends / Sundays the Studentenwerk page does not auto select a week which currently crashes the parser on those days. I am working on a fix. |
Should be fixed now |
Sorry for – once again – taking so long. The changes look good, I just pushed a small change to replace the I will now merge this (sqashed into one commit) into the master branch, which means that it will become available at Thanks again for writing the parser and following my (sometimes nitpicky) review! 😁 |
Thanks for the update! Again thanks for the continuous feedback, was really fun writing this parser and learning a few new things and best practices about python 👍 |
Soooo, I just moved all of the already existing canteens into one parser (they were separated by city) and adapted their names to your new naming scheme. After adding the new index-url of my server, they started to work immediately. 😉
I did not add the other canteens yet, mainly because I wanted to automate it (see the halle parser and my changes to it today). My plan is as follows: The city overview page (eg Jena) contains a list of canteens including their addresses. It also contains a hidden form, which contains geo-coordinates for all canteens (used for the map display). This way we should be able to generate the complete metadata info and avoid copying it by hand into openmensa (which of course would be much faster, but missing all the fun 😁). |
I implemented the metadata extraction in 90b95c9 and fixed a small issue, where an empty meal name would lead to an exception, because they are not allowed. Now, empty meals are just skipped: fddcca4 The extraction worked great for all but three canteens:
These three were missing in the map display, so I created them manually: I did some quick sanity checks (the coordinates are in the right city), but it would be great if you could also take a look! |
Added Parser for
from TU Ilmenau.
http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/mensa-ehrenberg.html
http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/cafeteria-mensa-ehrenberg.html
http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/cafeteria-nanoteria.html
http://www.stw-thueringen.de/deutsch/mensen/einrichtungen/ilmenau/cafeteria-roentgenbau.html
My current implementation marks days in the past as closed. Please let me know if that is not the desired behavior and I will change it :)