tldextract.update() strange behaviour #193

Closed
commutecat opened this issue Mar 13, 2020 · 2 comments

Comments

@commutecat

I'm writing a script to find out whether a domain uses a ccTLD, and to differentiate between public and private domains.

So I instantiate two separate tldextract.TLDExtract objects, each with an explicit include_psl_private_domains setting. However, I notice the objects behave differently depending on whether you "trigger" the extraction right after update() or later on.

Code #1 shows that include_psl_private_domains=True works if I "test" each extractor immediately after creating it.

# Trigger the extraction immediately
import tldextract
url = "www.blogspot.co.za"
expub = tldextract.TLDExtract(include_psl_private_domains=False)
expub.update()
expub(url)
# result : ExtractResult(subdomain='www', domain='blogspot', suffix='co.za')

expri = tldextract.TLDExtract(include_psl_private_domains=True)
expri.update()
expri(url)
# result : ExtractResult(subdomain='', domain='www', suffix='blogspot.co.za')

url2 = "another.blogspot.co.uk"
expub(url2)
# result : ExtractResult(subdomain='another', domain='blogspot', suffix='co.uk')
expri(url2)
# result : ExtractResult(subdomain='', domain='another', suffix='blogspot.co.uk')

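In the following code (Code #2), include_psl_private_domains=True doesn't work as intended. The only difference is that both extractors are created and updated before either one is called.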

import tldextract
url = "www.blogspot.co.za"
expub = tldextract.TLDExtract(include_psl_private_domains=False)
expub.update()
expri = tldextract.TLDExtract(include_psl_private_domains=True)
expri.update()

expub(url)
# result : ExtractResult(subdomain='www', domain='blogspot', suffix='co.za')
expri(url) 
# result : ExtractResult(subdomain='www', domain='blogspot', suffix='co.za')

# Try another domain 
url2 = "another.blogspot.co.uk"
expub(url2)
# result : ExtractResult(subdomain='another', domain='blogspot', suffix='co.uk')
expri(url2) 
# result : ExtractResult(subdomain='another', domain='blogspot', suffix='co.uk')
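
For reference, the check the script is ultimately after can be sketched as below. This is a minimal sketch, not part of tldextract: the helper name uses_private_suffix and the two stub extractors are hypothetical and only reproduce the results reported in Code #1. With the real library you would pass expub and expri instead of the stubs.

```python
from collections import namedtuple

# Stand-in for tldextract's ExtractResult, limited to the three fields shown above.
Result = namedtuple("Result", ["subdomain", "domain", "suffix"])


def uses_private_suffix(public_extract, private_extract, host):
    """Return True when including private PSL entries changes the suffix of `host`."""
    return public_extract(host).suffix != private_extract(host).suffix


# Hypothetical stubs that mimic the Code #1 results for three-label-plus-suffix hosts.
def stub_public(host):
    # Treat the first label as subdomain, the second as domain, the rest as suffix.
    sub, dom, *rest = host.split(".")
    return Result(sub, dom, ".".join(rest))


def stub_private(host):
    # Treat the first label as domain and everything after it as a private suffix.
    dom, *rest = host.split(".")
    return Result("", dom, ".".join(rest))


print(uses_private_suffix(stub_public, stub_private, "www.blogspot.co.za"))
```

If both arguments behave like the public-only extractor, which is what Code #2 shows, the helper returns False for every host and the private/public distinction is lost.
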
@john-kurkowski
Owner

I think this is covered by #66.

@floer32
Collaborator

floer32 commented Mar 23, 2020

Yes, duplicate of #66.

@floer32 floer32 closed this as completed Mar 23, 2020