Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: LxmlLinkExtractor unique_list missing key #6221

Merged
merged 2 commits into from
Apr 19, 2024

Conversation

jxlil
Copy link
Contributor

@jxlil jxlil commented Feb 14, 2024

Added key argument to unique_list in LxmlLinkExtractor

Before:

seenkey: Link(url='http://example.com/page1', text='Página 1', fragment='', nofollow=False)

After:

seenkey: http://example.com/page1

Closes #3273

Copy link

codecov bot commented Feb 14, 2024

Codecov Report

Merging #6221 (5e51417) into master (1c9d308) will increase coverage by 0.04%.
Report is 68 commits behind head on master.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6221      +/-   ##
==========================================
+ Coverage   88.55%   88.60%   +0.04%     
==========================================
  Files         160      161       +1     
  Lines       11607    11775     +168     
  Branches     1883     1908      +25     
==========================================
+ Hits        10279    10433     +154     
- Misses       1003     1011       +8     
- Partials      325      331       +6     
Files Coverage Δ
scrapy/linkextractors/lxmlhtml.py 95.55% <100.00%> (ø)

... and 9 files with indirect coverage changes

@Gallaecio
Copy link
Member

Can you add a test for it?

@jxlil
Copy link
Contributor Author

jxlil commented Feb 15, 2024

I think that in tests/test_linkextractors.py all cases are already being covered. On codecov.io it also seems that this change is covered: https://app.codecov.io/gh/scrapy/scrapy/pull/6221

@Gallaecio
Copy link
Member

If this was properly covered, the tests would have failed before the change. Let me give it a go later on.

@jxlil
Copy link
Contributor Author

jxlil commented Feb 23, 2024

Hi @Gallaecio I see that you added some tests (some were also added here: #6232)

Is there something missing that I can help with?

Copy link
Member

@Gallaecio Gallaecio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, I forgot to approve 🤦

cc @wRAR

@jxlil
Copy link
Contributor Author

jxlil commented Feb 23, 2024

Oh ok, no problem. Thanks!

@jxlil
Copy link
Contributor Author

jxlil commented Apr 2, 2024

Hi @wRAR ,

Hope you're doing well. Are any additional changes needed in this PR? Thank you.

@wRAR
Copy link
Member

wRAR commented Apr 4, 2024

@jxlil this is still in my backlog, I hope to revisit it soon :)

Copy link
Member

@wRAR wRAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@wRAR wRAR merged commit a5da77d into scrapy:master Apr 19, 2024
26 checks passed
@jxlil jxlil deleted the fix/LxmlLinkExtractor branch April 19, 2024 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LxmlLinkExtractor unique_list missing key
3 participants