CLI: add option to probe for extractable content, more robust downloads and html2txt #378

Merged
merged 13 commits into from Jun 22, 2023

Conversation

adbar
Owner

@adbar adbar commented Jun 20, 2023

No description provided.

@adbar adbar marked this pull request as draft June 20, 2023 15:44
@codecov

codecov bot commented Jun 20, 2023

Codecov Report

Merging #378 (e40b465) into master (d1d5081) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
+ Coverage   96.68%   96.70%   +0.02%     
==========================================
  Files          22       22              
  Lines        3404     3425      +21     
==========================================
+ Hits         3291     3312      +21     
  Misses        113      113              
Impacted Files              Coverage Δ
trafilatura/cli.py          93.37% <100.00%> (+0.13%) ⬆️
trafilatura/cli_utils.py    91.81% <100.00%> (+0.38%) ⬆️
trafilatura/core.py         98.02% <100.00%> (+<0.01%) ⬆️
trafilatura/downloads.py    97.00% <100.00%> (+0.01%) ⬆️
trafilatura/utils.py        98.51% <100.00%> (+0.03%) ⬆️

@adbar adbar changed the title CLI: add probing mode CLI: add option to probe for extractable content Jun 22, 2023
@adbar adbar changed the title CLI: add option to probe for extractable content CLI: add option to probe for extractable content, more robust downloads and html2txt Jun 22, 2023
@adbar adbar marked this pull request as ready for review June 22, 2023 12:09
@adbar adbar merged commit 2f1fd35 into master Jun 22, 2023
13 checks passed
@adbar adbar deleted the probe_website branch June 22, 2023 12:18