Conversation

@honzajavorek (Collaborator) commented Nov 18, 2025

This is a proof of concept of how we could test exercises in the academy:

  • Exercises target real world websites, so they can easily break without us knowing
  • We could test them e.g. weekly, so it's not too noisy
  • The solution should be as simple as possible and capable of running both JavaScript and Python
  • While working on the PoC I actually discovered one exercise which is broken 95% of the time due to aggressive anti-scraping protections, so I changed it
  • The test can be executed with `bats -r --print-output-on-failure .`
  • Bats is a simple testing framework based on Bash, which lets us run arbitrary programs and evaluate their output (https://github.com/bats-core/bats-core). Not sure yet how to get it inside the CI, but on macOS it's just `brew install bats`
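
For readers unfamiliar with this style of testing, here is a minimal sketch of the idea, written in Python rather than Bats: run a program, then assert on its exit code and printed output. The helper `run_exercise` and the inline stand-in program are hypothetical and not part of this PR; the actual tests are written in Bats.

```python
import subprocess
import sys


def run_exercise(command: list[str], timeout: float = 60.0) -> tuple[int, list[str]]:
    """Run a program and return its exit code and its stdout split into lines."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout.splitlines()


# Hypothetical stand-in for an exercise solution. A real check would run
# the actual solution file instead of this inline one-liner.
fake_solution = [
    sys.executable,
    "-c",
    "print('Product A | 10'); print('Product B | 20')",
]

status, lines = run_exercise(fake_solution)

# The same kinds of checks a Bats test makes: exit status and output content.
assert status == 0
assert len(lines) == 2
assert "Product A" in lines[0]
```

Because the exercises scrape live websites, assertions like these are better kept loose (for example, checking that some output exists and has the expected shape) than tied to exact values that change over time.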

Todo:

  • Implement Python exercise
  • Implement JS exercise
  • Implement GitHub Action
  • Discuss with the team whether we want this
  • Document the solution
  • Port the rest of the exercises
  • Change the YAML to do just crons

Note

Adds a monthly GitHub Action and Bats-based suite to run Academy JS/Python exercise solutions, embeds solutions into lessons, and updates docs/ignore files.

  • CI:
    • Add monthly and manual workflow .github/workflows/test-academy.yml to run Academy exercises via Node (npm) and Python (uv).
  • Testing:
    • Introduce Bats-based tests for Academy exercises: sources/academy/**/exercises/test.bats (JS & Python) executing solutions and asserting outputs.
    • New npm script test:academy and dev dependency bats.
  • Docs/Academy content:
    • Embed executable exercise solutions using CodeBlock + !!raw-loader across JS/Python lessons (04–12), replacing inline snippets.
    • Update several exercises (e.g., switch example from AliExpress to LEGO) and add multiple new solution files (Cheerio/BeautifulSoup/Crawlee).
  • Repo:
    • Update .gitignore to exclude exercise artifacts (storage, node_modules, package*.json, dataset.json).
    • Document testing process in CONTRIBUTING.md (broken links check + Academy exercises CI).

Written by Cursor Bugbot for commit 118a550.

@honzajavorek honzajavorek added the t-academy label (Issues related to Web Scraping and Apify academies.) Nov 18, 2025
@honzajavorek honzajavorek changed the title from "Test Academy exercises" to "chore: test Academy exercises" Nov 18, 2025
@honzajavorek honzajavorek force-pushed the honzajavorek/test-exercises branch from ebf76a7 to 05099c1 November 21, 2025 16:35
@apify-service-account

Preview for this PR was built for commit 05099c1 and is ready at https://pr-2097.preview.docs.apify.com!

@tomnosek
Contributor

From my perspective, we should design them in a way that they run automatically and do not require manual start-up. Otherwise, we'll forget about it eventually.

@honzajavorek
Collaborator Author

Sure, I'd make a GitHub Action that runs about once a week or once a month (it depends on how quickly we can fix the exercises; it doesn't make sense to run them too often).

@honzajavorek
Collaborator Author

(Creating such a GitHub Action is a matter of a few lines, and I'll add it to this PR.)

@apify-service-account

Preview for this PR was built for commit cf913593 and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit cb4959b2 and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 5e78a4db and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 463457b9 and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 0214184a and is ready at https://pr-2097.preview.docs.apify.com!

@honzajavorek honzajavorek force-pushed the honzajavorek/test-exercises branch from 0214184 to b930d1f November 24, 2025 17:18
@apify-service-account

Preview for this PR was built for commit b930d1f and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 4fccbf45 and is ready at https://pr-2097.preview.docs.apify.com!

@honzajavorek honzajavorek force-pushed the honzajavorek/test-exercises branch from 4fccbf4 to 96ae391 November 25, 2025 11:59
@apify-service-account

Preview for this PR was built for commit 96ae391 and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 44d7b1a and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit ddf09129 and is ready at https://pr-2097.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 6c5f05bb and is ready at https://pr-2097.preview.docs.apify.com!

@honzajavorek honzajavorek mentioned this pull request Nov 25, 2025
4 tasks
@honzajavorek honzajavorek force-pushed the honzajavorek/test-exercises branch from 6c5f05b to 118a550 November 25, 2025 14:46
@apify-service-account

Preview for this PR was built for commit 118a550 and is ready at https://pr-2097.preview.docs.apify.com!

@honzajavorek
Collaborator Author

I think this should be ready now. The Bats tests currently do fail, because there are failing exercises, so that's expected. But the testing infrastructure is solid and this PR is about the infrastructure. I've set the frequency of the tests to monthly and let's see. I didn't want to snowball this PR, so I recorded the failures and other issues separately and I'll work on them in subsequent PRs:

@honzajavorek honzajavorek marked this pull request as ready for review November 25, 2025 14:50
@honzajavorek honzajavorek requested a review from B4nan November 25, 2025 14:50
- Run `vale sync` to download styles
- Configure exceptions in `accepts.txt`

### Testing
Collaborator Author

@TC-MO Can you take a look at this README change, please? Does it make sense this way?

 "dependencies": {
-"@apify/ui-library": "^1.97.2",
 "@apify/ui-icons": "^1.19.0",
+"@apify/ui-library": "^1.97.2",
Collaborator Author

@honzajavorek honzajavorek Nov 25, 2025

Not my change, npm re-ordered this on its own 👀

<Exercises />

-### Scrape AliExpress
+### Scrape LEGO
Collaborator Author

This is the only one I fixed. After this one I decided that fixing the exercises should happen in separate PRs, not in this one: #2113

@cursor cursor bot left a comment

This PR is being reviewed by Cursor Bugbot

@apify-service-account

Preview for this PR was built for commit 8c875d1 and is ready at https://pr-2097.preview.docs.apify.com!

Labels

t-academy Issues related to Web Scraping and Apify academies.

4 participants