[question] How to ignore/exclude owner_username ? #113

Closed
nagualcode opened this issue May 4, 2018 · 3 comments

Comments

@nagualcode

commented May 4, 2018

I am crawling #hashtags, but would like to exclude a specific owner_username.
Can I do this with --post-filter= ?

@nagualcode

Author

commented May 4, 2018

or better yet: Download only 1 (the most recent) post from each user.
That way, a crawl limited with --count 50 (for example) cannot be dominated by a few users who overuse the given hashtag.

@aandergr

Member

commented May 8, 2018

> I am crawling #hashtags, but would like to exclude a specific owner_username.
> Can I do this with --post-filter= ?

Yes, with --post-filter="owner_username != 'USER'".
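
For scripted use, the same exclusion can be sketched with the instaloader Python module. This is a minimal sketch, not how `--post-filter` is implemented: the helper names `exclude_owner` and `download_hashtag_without_user` are hypothetical, while `Post.owner_username`, `get_hashtag_posts` and `download_post` are taken from the module as used in this thread.

```python
from typing import Iterable, Iterator


def exclude_owner(posts: Iterable, username: str) -> Iterator:
    """Yield only the posts whose owner_username differs from `username`.

    Hypothetical helper: works on any objects exposing `owner_username`.
    """
    for post in posts:
        if post.owner_username != username:
            yield post


def download_hashtag_without_user(hashtag: str, username: str) -> None:
    """Download the posts of `hashtag`, skipping those owned by `username`."""
    # Imported here so that exclude_owner() can be used without instaloader.
    import instaloader

    L = instaloader.Instaloader()
    posts = L.get_hashtag_posts(hashtag)
    for post in exclude_owner(posts, username):
        L.download_post(post, '#' + hashtag)
```

Calling `download_hashtag_without_user('somehashtag', 'USER')` would then mirror the command-line filter above; the hashtag name here is only a placeholder.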

aandergr added the question label May 8, 2018
aandergr closed this Jun 4, 2018

@aandergr

Member

commented Jun 4, 2018

> or better yet: Download only 1 (the most recent) post from each user.

You could keep a set of the users from whom a post has already been downloaded. While iterating over the posts, check whether the post's owner is already in the set. If so, skip the post; otherwise, download it and add the user to the set:

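import instaloader

L = instaloader.Instaloader()

# Iterator over the posts tagged with the hashtag
posts = L.get_hashtag_posts('milfgarden')

# Owners from whom a post has already been downloaded
users = set()

for post in posts:
    if post.owner_profile not in users:
        # First post seen from this owner: download it and remember the owner
        L.download_post(post, '#milfgarden')
        users.add(post.owner_profile)
    else:
        print("{} from {} skipped.".format(post, post.owner_profile))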
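
The snippet above has no equivalent of the count limit mentioned in the question. A possible way to combine both, sketched under assumptions: the helpers `first_post_per_user` and `download_one_per_user` are hypothetical names, owners are tracked by `owner_username` rather than `owner_profile`, and the hashtag iterator is assumed to yield newer posts first, which this thread does not confirm.

```python
from itertools import islice
from typing import Iterable, Iterator


def first_post_per_user(posts: Iterable) -> Iterator:
    """Yield each post whose owner has not appeared earlier in `posts`."""
    seen = set()
    for post in posts:
        owner = post.owner_username
        if owner not in seen:
            seen.add(owner)
            yield post


def download_one_per_user(hashtag: str, count: int) -> None:
    """Download at most `count` posts of `hashtag`, one per distinct owner."""
    # Imported here so that first_post_per_user() works without instaloader.
    import instaloader

    L = instaloader.Instaloader()
    posts = L.get_hashtag_posts(hashtag)
    # islice stops after `count` yielded posts; skipped posts do not count.
    for post in islice(first_post_per_user(posts), count):
        L.download_post(post, '#' + hashtag)
```

Because the limit is applied after deduplication, a call with `count=50` would yield posts from up to 50 different users instead of stopping at the first 50 posts of the hashtag.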
aandergr added a commit that referenced this issue Jun 4, 2018
Presents code examples that use the instaloader module for more advanced tasks
than what is possible with the Instaloader command line interface.

Presents #46, #56, #110, #113, #120, #121.