Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove deprecated SgmlLinkExtractor and RegexLinkExtractor #4679

Merged
merged 2 commits into from
Jul 17, 2020
Merged

Remove deprecated SgmlLinkExtractor and RegexLinkExtractor #4679

merged 2 commits into from
Jul 17, 2020

Conversation

ashellunts
Copy link
Contributor

@ashellunts ashellunts commented Jul 14, 2020

BaseSgmlLinkExtractor is also removed.

Issue #4356

@ashellunts ashellunts changed the title Remove deprecated class SgmlLinkExtractor (#4356) Remove deprecated class SgmlLinkExtractor Jul 14, 2020
@codecov
Copy link

codecov bot commented Jul 14, 2020

Codecov Report

Merging #4679 into master will increase coverage by 1.03%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4679      +/-   ##
==========================================
+ Coverage   85.16%   86.20%   +1.03%     
==========================================
  Files         162      160       -2     
  Lines        9748     9632     -116     
  Branches     1437     1414      -23     
==========================================
+ Hits         8302     8303       +1     
+ Misses       1186     1070     -116     
+ Partials      260      259       -1     
Impacted Files Coverage Δ
scrapy/utils/trackref.py 85.71% <0.00%> (+2.85%) ⬆️

@Gallaecio
Copy link
Member

@ashellunts
Copy link
Contributor Author

@Gallaecio could you please advise how to proceed? Is RegexLinkExtractor still needed?
If yes, why is it subclass of class SgmlLinkExtractor? What methods from SgmlLinkExtractor should be available in RegexLinkExtractor?

@wRAR
Copy link
Member

wRAR commented Jul 15, 2020

Yeah, this is strange, as __init__ is not overridden you will have a deprecation warning about SgmlLinkExtractor when instantiating RegexLinkExtractor.

OTOH, I just tried this and found scrapy.linkextractors.sgml imports sgmllib which is py2-only so this code just doesn't work anymore.

@Gallaecio
Copy link
Member

In that case, maybe we should remove regex.py as well? No one must be using it in Python 3 if this is the first time we hear about it.

@wRAR
Copy link
Member

wRAR commented Jul 16, 2020

Yeah, I almost suggested that too. We are usually very careful when removing things, but this class should have shown a deprecation warning anyway.

@ashellunts
Copy link
Contributor Author

Ok, thank you for answers. Will update PR.

@ashellunts
Copy link
Contributor Author

Shall I relpace RegexLinkExtractor to just LinkExtractor in these docs?

./sep/sep-016.rst:       legs = [RegexLinkExtractor(), ParseRules(), CanonicalizeUrl(), ItemIdSetter()]
./sep/sep-018.rst:       middlewares = [RegexLinkExtractor(), CallbackRules(), CanonicalizeUrl(), 

@ashellunts ashellunts changed the title Remove deprecated class SgmlLinkExtractor Remove deprecated SgmlLinkExtractor and RegexLinkExtractor Jul 16, 2020
@wRAR
Copy link
Member

wRAR commented Jul 16, 2020

/sep/ is historical so it shouldn't be changed.

I see there are conflicts, can you please resolve them?

BaseSgmlLinkExtractor is also removed.
It uses deprecated SgmlLinkExtractor as base class.
@ashellunts
Copy link
Contributor Author

/sep/ is historical so it shouldn't be changed.

I see there are conflicts, can you please resolve them?

Done.

@ashellunts
Copy link
Contributor Author

Build in travis is green. But github still shows it as "in progress". 🤔

@ashellunts
Copy link
Contributor Author

Ok, link goes to last successful build in Travis. But build for latest commit is not yet started.

@ashellunts
Copy link
Contributor Author

Looks like travis doesn't recognize force push.

@dangra
Copy link
Member

dangra commented Jul 16, 2020

Now it is 🍏 (I've restarted one build to trigger the update on this PR)

@ashellunts
Copy link
Contributor Author

Thank you @dangra.

@Gallaecio Gallaecio merged commit 62a4ede into scrapy:master Jul 17, 2020
@ashellunts
Copy link
Contributor Author

🎉🎉🎉

@Gallaecio
Copy link
Member

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants