Can't get the full article. #42

Closed
yaseenox-personal opened this issue Oct 11, 2016 · 4 comments

Comments

@yaseenox-personal

Hi, I want to extract the full article from the source URL, but I only get the title and a small part of the text, under the "description" key.

@michaelhelmick
Owner

Which url are you trying to parse?

@yaseenox-personal
Author

I tried more than one article. This one, for example: 'http://www.bbc.com/news/business-37618618'

@michaelhelmick
Owner

Hm, it looks like they're setting the meta description as well as og:description and twitter:description.

You won't be able to extract the full article with lassie, but you will be able to get the description. I'll see if I can fix this.
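
To illustrate the distinction drawn above, here is a minimal sketch of reading description-style meta tags from a page using only the Python standard library. It is not lassie's implementation; the `DescriptionParser` class, the `extract_descriptions` helper, and the sample HTML are all made up for illustration. It shows why only a short summary comes back: the parser looks at `<meta>` tags and never touches the article body.

```python
from html.parser import HTMLParser

# Meta keys that commonly carry a page summary (assumed list, for illustration).
DESCRIPTION_KEYS = ("description", "og:description", "twitter:description")


class DescriptionParser(HTMLParser):
    """Collect description-style <meta> tags; the article body is ignored."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Plain meta tags use name=..., Open Graph tags use property=...
        key = attrs.get("name") or attrs.get("property")
        if key in DESCRIPTION_KEYS and attrs.get("content"):
            # Keep the first value seen for each key.
            self.found.setdefault(key, attrs["content"])


def extract_descriptions(html):
    """Return a dict mapping each description key found to its content."""
    parser = DescriptionParser()
    parser.feed(html)
    return parser.found


# Hypothetical page: three summaries in <head>, the real text in <body>.
SAMPLE_HTML = """
<html><head>
<meta name="description" content="Short summary.">
<meta property="og:description" content="Short summary for sharing.">
<meta name="twitter:description" content="Short summary for cards.">
</head><body><p>Full article text that is not captured.</p></body></html>
"""

descriptions = extract_descriptions(SAMPLE_HTML)
print(descriptions)
```

Getting the article body itself would need a different kind of tool, one that analyses the page content rather than its metadata.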

michaelhelmick added a commit that referenced this issue Oct 21, 2016
Fixes #42, resolve issue where data would not be returned if key was in dictionary
@michaelhelmick
Owner

This is fixed; version 0.8.3 is now available on PyPI!

The site you posted now returns:

{
    'site_name': u'BBC News',
    'description': u'Samsung has ceased production of its Galaxy Note 7 smartphones after reports of devices it had deemed safe catching fire.',
    'videos': [],
    'title': u'Samsung permanently stops Galaxy Note 7 production',
    'url': u'http://www.bbc.com/news/business-37618618',
    'status_code': 200,
    'locale': u'en_GB',
    'images': [{
        'src': 'http://www.bbc.com/news/business-37618618',
        'height': None,
        'width': None
    }, {
        'src': u'http://ichef.bbci.co.uk/news/1024/cpsprodpb/577B/production/_91759322_h2h1xuaj.jpg',
        'type': u'og:image'
    }]
}
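
As a small sketch of how a result shaped like the one above might be consumed, the snippet below picks out the Open Graph image. The `result` dict is copied from the output above and trimmed to the fields used; the `first_image_of_type` helper is hypothetical and not part of lassie. Not every image entry carries a `type` key, so the lookup uses `.get`.

```python
# Trimmed copy of the result shown above (values taken from that output).
result = {
    'title': 'Samsung permanently stops Galaxy Note 7 production',
    'images': [
        {
            'src': 'http://www.bbc.com/news/business-37618618',
            'height': None,
            'width': None,
        },
        {
            'src': 'http://ichef.bbci.co.uk/news/1024/cpsprodpb/577B/production/_91759322_h2h1xuaj.jpg',
            'type': 'og:image',
        },
    ],
}


def first_image_of_type(data, image_type):
    """Return the src of the first image with the given type, or None."""
    for image in data.get('images', []):
        # Some entries have no 'type' key, so avoid a direct lookup.
        if image.get('type') == image_type:
            return image['src']
    return None


og_image = first_image_of_type(result, 'og:image')
print(og_image)
```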
