Add freecodecamp support #804

Closed
rgaudin opened this issue Jul 21, 2023 · 4 comments · Fixed by #815

Comments

@rgaudin
Member

rgaudin commented Jul 21, 2023

New offliner for https://github.com/openzim/freecodecamp

@benoit74 benoit74 self-assigned this Jul 27, 2023
@benoit74
Collaborator

benoit74 commented Jul 27, 2023

Small points that need to be confirmed:

  • we support only the all command of fcc2zim (the individual commands fetch, prebuild and zim make no sense in Zimfarm)
  • we need a new platform in addition to the new offliner, in order to limit the number of tasks for fcc2zim per worker
  • we limit to 1 fcc2zim task per worker for now (this is not managed in this repo but better ask the question now)
  • output flag is not standard: "outpath" instead of "output"
  • there is no support for the JSON stat file in fcc2zim (for now)

@rgaudin
Member Author

rgaudin commented Jul 27, 2023

* we support only the `all` command of `fcc2zim` (the individual commands `fetch`, `prebuild` and `zim` make no sense in Zimfarm)

Yes; ZF doesn't care about scraper internals. We want to be able to supply the list of courses and the metadata, and get a ZIM.

* we need a new platform in addition to the new offliner, in order to limit the number of tasks for fcc2zim per worker
* we limit to 1 fcc2zim task per worker for now (this is not managed in this repo but better ask the question now)

No. Those are created only if we encounter issues.

* output flag is not standard: "outpath" instead of "output"

Right.

* there is no support for the JSON stat file in fcc2zim (for now)

Right.
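
To illustrate the two confirmed points (only the `all` command is exposed, and the output flag is `outpath` rather than `output`), here is a minimal sketch of how a command line for the scraper could be assembled. This is not the Zimfarm implementation: the helper name `build_fcc2zim_command`, the `--name=value` flag syntax, and the example flag names (`courses`, `debug`) are assumptions made for illustration only.

```python
from typing import Dict, List

# Per the discussion above, fcc2zim names its output flag "outpath", not "output".
OUTPUT_FLAG = "outpath"


def build_fcc2zim_command(flags: Dict[str, object], output_dir: str) -> List[str]:
    """Hypothetical helper: build an argv list for `fcc2zim all`.

    Only the `all` sub-command is used; `fetch`, `prebuild` and `zim`
    are deliberately not exposed.
    """
    cmd = ["fcc2zim", "all"]
    for name, value in sorted(flags.items()):
        if name in ("output", OUTPUT_FLAG):
            # The output location is forced below, whatever the recipe says.
            continue
        if value is True:
            # Boolean flags are passed without a value (assumed syntax).
            cmd.append(f"--{name}")
        elif value is not None and value is not False:
            cmd.append(f"--{name}={value}")
    cmd.append(f"--{OUTPUT_FLAG}={output_dir}")
    return cmd
```

With this sketch, a recipe that mistakenly sets the standard `output` key still ends up with a single `--outpath` argument pointing at the directory chosen by the caller.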

@benoit74
Collaborator

This task cannot be completed yet: work on openzim/freecodecamp#11 needs to be finished first (at least the first point, on CLI parameters) for this task to make sense. Otherwise, only a few devs will be able to create recipes for this scraper.

@benoit74 benoit74 removed their assignment Jul 27, 2023
@benoit74
Collaborator

benoit74 commented Jul 27, 2023

I pushed some WIP code to the add_fcc branch (https://github.com/openzim/zimfarm/tree/add_fcc). It is probably mostly ready, except for the flags / CLI parameters.
