Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't require <input>; Find all supported files in cwd automatically #410

Open
fregante opened this issue Nov 29, 2021 · 10 comments
Open

Don't require <input>; Find all supported files in cwd automatically #410

fregante opened this issue Nov 29, 2021 · 10 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@fregante
Copy link

Lychee supports a lot of files, I'd rather not have to maintain my own glob list. Can it just find and check all the supported filetypes like Prettier does for example?

@mre
Copy link
Member

mre commented Nov 29, 2021

Basic support for this will be merged together with #330. It provides directory support and will check html and markdown by default. We want to be a bit conservative in the beginning as we're not sure if everyone likes to have their txt files checked with the fuzzy parser. There will probably be an option to overwrite this. Something like --extensions md,html,txt or so.

# Check all Markdown and HTML files recursively
lychee .
# Check Text, JSON, YAML as well
lychee . --extensions md,html,txt,json,yaml

@mre
Copy link
Member

mre commented Nov 29, 2021

The question is if we also want to test the current directory when omitting the ..
So

lychee

would be the same as

lychee .

I can see good arguments for both. Opinions?

@mre
Copy link
Member

mre commented Jan 5, 2022

I thought about it and decided not to test the current directory when omitting the . and instead show the usage instructions.

@mre
Copy link
Member

mre commented Jan 5, 2022

With #330 being merged, lychee . will check all markdown and html files recursively by default. The missing piece to close this issue is to allow overwriting the extensions with --extensions md,html,txt.

@fregante
Copy link
Author

fregante commented Jan 6, 2022

Including all the supported extensions by default makes more sense to me, with the possibility to override the list with the flag if someone wants fewer extensions. This is because

  • Unless I continuously keep track of lychee's supported filetypes I'll miss out on them
  • I'd exclusively have use the flag to disable some extensions
  • Choosing fewer extensions means the flag will always be shorter than "enabling more extensions", because it's more likely to be used with 1-2 extensions instead of "the whole list"

@mre
Copy link
Member

mre commented Jan 6, 2022

Lychee can extract links from plaintext documents, so technically it could work with files like YAML or XML as well. Even files without an extension might be fine (e.g. README). I wonder if these files should be included by default and if not, where do we draw the line?

@lebensterben
Copy link
Member

@mre
ripgrep has an algorithm to detect whether a file is binary or text.
lychee should use similar algorithm and support all text by default.

@mre
Copy link
Member

mre commented Jan 7, 2022

Yeah, doing what ripgrep does sounds like a sane default. In the end, lychee is like a tiny ripgrep with a different objective. 😉 On the other hand I'm not sure if people would be interested in broken links inside SVG and XLS files. If you have a link to the algorithm I'd love to have a look.

@lebensterben
Copy link
Member

@mre
I agree that we should skip images, videos, and audios.
That is to say, we should skip all binary files plus certain categories. In terminology of XDG shared-mimeinfo, we should filter out files with image/, audio/, video/*.

@mre mre added enhancement New feature or request help wanted Extra attention is needed labels Feb 4, 2022
@mre
Copy link
Member

mre commented Mar 4, 2022

To summarize:

  • Support overwriting extensions (lychee . --extensions md,html,txt,json,yaml would only check these extensions)
  • Check all text files by default. Do what ripgrep does to detect them.
  • Exclude media files (images, videos, and audios)

@mre mre added the good first issue Good for newcomers label Jan 29, 2024
Wumpf added a commit to rerun-io/rerun that referenced this issue Apr 16, 2024
### What

Code files are now included in link checking.

* Fixes #5925

Some issues encountered:
* globs don't respect exclude_files which makes working with this
locally harder lycheeverse/lychee#1405
* can't specify extensions / correct files aren't picked up
automatically lycheeverse/lychee#410

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/5986?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/5986?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!

- [PR Build Summary](https://build.rerun.io/pr/5986)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants