Fix Cocktail Party scraper #270

zdenek-biberle · 2024-04-01T17:26:24Z

Cocktail Party have changed their website. It now appears to be using something called Elementor, quite possibly https://elementor.com/. This has caused the old scraper to stop working.

This commit fixes the scraper. I've tried it with 192 cocktails from Cocktail Party and, except for a few, it worked fine. The ones that didn't work use units that recipe-utils doesn't understand, such as "stick" or "half", so I assume they wouldn't have worked with the old scraper either.

I've been stumped a bit by the change in 2c4c6a4#diff-21fdca419f69f37e6f27b80fc3a35eae8faa9e1920dd2762e8bba5a8d081a891 - I'm not exactly sure what the plan is with the sort field, but it seems to me that it has broken scraping. Thus I've changed how the ingredients field is built in AbstractSiteExtractor: it now includes the sort field as well. Please let me know if that's okay or not.
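
To illustrate the idea, here is a minimal Python sketch (the project itself is PHP, and this is not the actual AbstractSiteExtractor code). It assumes that sort is simply the ingredient's position on the page, counting from 1. The function name `build_ingredients` and the dict keys `name`, `amount`, and `units` are hypothetical.

```python
def build_ingredients(parsed):
    """Attach a 1-based `sort` position to each scraped ingredient.

    `parsed` is an iterable of dicts in page order (hypothetical shape).
    """
    ingredients = []
    for position, item in enumerate(parsed, start=1):
        # Copy the parsed fields and add the position as `sort`
        ingredients.append({**item, "sort": position})
    return ingredients


# Hypothetical scraped data, in the order it appears on the page
scraped = [
    {"name": "Rye whiskey", "amount": 2, "units": "oz"},
    {"name": "Sweet vermouth", "amount": 1, "units": "oz"},
]
result = build_ingredients(scraped)
```

Copying each dict rather than mutating it leaves the parser's output untouched, which keeps the step easy to reason about.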

karlomikus · 2024-04-02T14:23:21Z

Nice, lgtm.

Sort is used for ordering ingredients; in the scraping context it's usually just auto-incremented from 1 for each ingredient.

Thanks!

zdenek-biberle · 2024-04-02T16:45:46Z

Ah, I missed the fact that there is a develop branch. Probably should've based my changes on top of it instead of on master, right?

> Sort is used for sorting of ingredients, in scraping context its usually just autoincrementing from 1 for each ingredient.

That was basically my understanding of it as well, so that's good.

> Thanks!

No worries, thank you!

This is a bit of a successor to karlomikus#270. This commit adds four improvements to the Cocktail Party scraper:

First, some Cocktail Party recipes use units that the recipe-utils parser doesn't understand. For example, the [Manhattan Bianco](https://cocktailpartyapp.com/drinks/manhattan-bianco/) uses a "piece." Such ingredients would've simply been presented to the user without the unit and the user had to fill that in themselves. Now the code will fall back to whatever the parser didn't parse, which is a fairly good default for Cocktail Party.

Next, the Cocktail Party website uses "parts" for lots of ingredients, but they actually mean fluid ounces (i.e. the same recipes in their mobile app show up with fluid ounces instead of parts). Thus the scraper now maps parts to fluid ounces.

Next, the scraper now reads the links in the "post info" part of the page as tags. The links usually provide categories or names of the cocktail's creator, so this works out nicely.

Finally, I've fixed an oversight introduced in karlomikus#270. The code for parsing the cocktail's description goes through all the paragraphs and then joins them up to form proper Markdown paragraphs. However, those paragraphs were then squashed together within the toArray() function in the clean up process. That's obviously undesirable. So now the paragraphs are cleaned up before they're joined together, which produces nice Markdown with multiple paragraphs.
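
The unit handling and the paragraph clean-up described above could be sketched roughly like this in Python. The project is PHP, so this is only an illustration, not the scraper's real code. `KNOWN_UNITS`, `UNIT_ALIASES`, `resolve_unit`, `clean`, and `join_paragraphs` are hypothetical names, and the known-unit set is a made-up stand-in for whatever recipe-utils actually recognizes.

```python
import re

# Stand-in for the units the parser understands (assumption, not the real list)
KNOWN_UNITS = {"oz", "ml", "cl", "dash"}

# Cocktail Party's "parts" are treated as fluid ounces
UNIT_ALIASES = {"part": "oz", "parts": "oz"}


def resolve_unit(parsed_unit, unparsed_rest):
    """Map parts to oz, keep known units, else fall back to the unparsed text."""
    unit = parsed_unit.strip().lower()
    if unit in UNIT_ALIASES:
        return UNIT_ALIASES[unit]
    if unit in KNOWN_UNITS:
        return unit
    # The parser found no usable unit: use whatever it left unparsed
    return unparsed_rest.strip()


def clean(text):
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", text).strip()


def join_paragraphs(paragraphs):
    """Clean each paragraph first, then join with blank lines for Markdown."""
    cleaned = [clean(p) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

The order in `join_paragraphs` is the point: running `clean` on the already joined text would collapse the blank lines between paragraphs into single spaces, which is the kind of squashing the last fix avoids.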

karlomikus and others added 6 commits January 27, 2024 14:18

Merge pull request karlomikus#260 from karlomikus/develop

7ffca47

Develop

Merge pull request karlomikus#261 from karlomikus/develop

a09aac4

Generate slug if missing

Merge pull request karlomikus#262 from karlomikus/develop

6507865

Develop

Merge pull request karlomikus#263 from karlomikus/develop

04af7fb

Add copy, add external classes

Merge pull request karlomikus#264 from karlomikus/develop

5e6bd12

Sync only cocktail thumbs

karlomikus changed the base branch from master to develop April 2, 2024 08:00

karlomikus merged commit b643ff9 into karlomikus:develop Apr 2, 2024
1 check passed

zdenek-biberle mentioned this pull request Apr 16, 2024

Some improvements to the Cocktail Party scraper #273

Merged
