honzajavorek
Collaborator

Part of #1584


⚠️ 🐍 This PR also contains a small change to the Python course, to keep the lessons consistent and in sync.

@honzajavorek honzajavorek requested review from gullmar and TC-MO August 26, 2025 07:03
@honzajavorek honzajavorek added the t-academy Issues related to Web Scraping and Apify academies. label Aug 26, 2025
@apify-service-account

Preview for this PR was built for commit a083a57 and is ready at https://pr-1846.preview.docs.apify.com!

Contributor

@TC-MO TC-MO left a comment

LGTM

@honzajavorek honzajavorek force-pushed the honzajavorek/js2-scraping-variants branch from a083a57 to 0d22cf5 September 2, 2025 08:46
@apify-service-account

Preview for this PR was built for commit 0d22cf5 and is ready at https://pr-1846.preview.docs.apify.com!

@honzajavorek
Collaborator Author

Thanks! I noticed there was a conflict with master now, so I resolved it. I'll wait for @gullmar to check code before merging.

@honzajavorek
Collaborator Author

@cursor review

cursor[bot]

This comment was marked as outdated.

@honzajavorek
Collaborator Author

@cursor review

@apify-service-account

Preview for this PR was built for commit 0bf1ec4c and is ready at https://pr-1846.preview.docs.apify.com!

@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no bugs!

item = parse_product(product, listing_url)
product_soup = download(item["url"])
vendor = product_soup.select_one(".product-meta__vendor").text.strip()
const $promises = $(".product-item").map(async (i, element) => {
Contributor

Suggested change
- const $promises = $(".product-item").map(async (i, element) => {
+ const promises = $(".product-item").toArray().map(async (element) => {


return item;
});
const data = await Promise.all($promises.get());
Contributor

Suggested change
- const data = await Promise.all($promises.get());
+ const data = await Promise.all(promises);

const $ = await download(listingURL);

Python dictionaries are mutable, so if we assigned the variant with `item["variant_name"] = ...`, we'd always overwrite the values. Instead of saving an item for each variant, we'd end up with the last variant repeated several times. To avoid this, we create a new dictionary for each variant and merge it with the `item` data before adding it to `data`. If we don't find any variants, we add the `item` as is, leaving the `variant_name` key empty.
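
A minimal sketch of the pitfall that paragraph describes, assuming a made-up product and two made-up variant names (none of these values come from the lesson). The first loop mutates one shared dictionary and appends it repeatedly; the second builds a new dictionary per variant and merges it with the item data:

```python
# Hypothetical product title and variant names, for illustration only.
variant_names = ["55 inch", "65 inch"]

# Pitfall: the same dictionary object is mutated and appended each time,
# so every entry in the list points at the same, last-written state.
shared_item = {"title": "Example TV", "variant_name": None}
overwritten = []
for name in variant_names:
    shared_item["variant_name"] = name
    overwritten.append(shared_item)

# Fix: create a new dictionary for each variant and merge it with the item.
item = {"title": "Example TV", "variant_name": None}
merged = []
for name in variant_names:
    variant = {"variant_name": name}
    merged.append({**item, **variant})  # a new dict; item stays untouched

print([entry["variant_name"] for entry in overwritten])  # ['65 inch', '65 inch']
print([entry["variant_name"] for entry in merged])       # ['55 inch', '65 inch']
```

The `{**item, **variant}` form is used here only because it works on older Python 3 versions as well; on Python 3.9 and later, `item | variant` produces the same merged dictionary.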
const $promises = $(".product-item").map(async (i, element) => {
Contributor

Suggested change
- const $promises = $(".product-item").map(async (i, element) => {
+ const promises = $(".product-item").toArray().map(async (element) => {

// highlight-end
});
// highlight-start
const itemLists = await Promise.all($promises.get());
Contributor

Suggested change
- const itemLists = await Promise.all($promises.get());
+ const itemLists = await Promise.all(promises);

}
return [{ variantName: null, ...item }];
});
const itemLists = await Promise.all($promises.get());
Contributor

Suggested change
- const itemLists = await Promise.all($promises.get());
+ const itemLists = await Promise.all(promises);

const listingURL = "https://www.npmjs.com/search?page=0&q=keywords%3Allm&sortBy=dependent_count";
const $ = await download(listingURL);

const $promises = $("section").map(async (i, element) => {
Contributor

Suggested change
- const $promises = $("section").map(async (i, element) => {
+ const promises = $("section").toArray().map(async (element) => {

return { name, url, description, dependents, downloads };
});

const data = await Promise.all($promises.get());
Contributor

Suggested change
- const data = await Promise.all($promises.get());
+ const data = await Promise.all(promises);

const listingURL = "https://edition.cnn.com/sport";
const $ = await download(listingURL);

const $promises = $(".layout__main .card").map(async (i, element) => {
Contributor

Suggested change
- const $promises = $(".layout__main .card").map(async (i, element) => {
+ const promises = $(".layout__main .card").toArray().map(async (element) => {

return { url: articleURL, length: content.length };
});

const data = await Promise.all($promises.get());
Contributor

Suggested change
- const data = await Promise.all($promises.get());
+ const data = await Promise.all(promises);

Co-authored-by: gullmar <gullmar@mailbox.org>
@apify-service-account

Preview for this PR was built for commit 90d24b39 and is ready at https://pr-1846.preview.docs.apify.com!

@honzajavorek
Collaborator Author

Thank you! 🙇‍♂️

@honzajavorek honzajavorek merged commit 6797d5f into master Sep 3, 2025
9 checks passed
@honzajavorek honzajavorek deleted the honzajavorek/js2-scraping-variants branch September 3, 2025 08:04
daveomri pushed a commit to daveomri/apify-docs that referenced this pull request Sep 3, 2025
…ut JavaScript (apify#1846)