[question] export all likes on a profile #120

Closed
nikartz opened this issue May 16, 2018 · 7 comments

Comments

@nikartz

commented May 16, 2018

Is there a way to export all profiles that liked pictures on a given profile? I'm thinking of something similar to this approach for followers.
I've played around with get_likes from structures.py, but I've had no luck.
Maybe someone can help!

If anyone is interested: I am trying to get a list of the people who follow someone (done, thanks to get_followers or get_followees) and compare that list to all profiles that liked a picture (or perhaps only pictures from the last 6 months or so). This way I want to find all followers that haven't liked anything (= ghost followers) and block them manually. I don't want to block anyone automatically, so that I can choose exactly which profiles to block.

@nikartz

Author

commented May 19, 2018

I have now found a working solution and want to share it with anyone interested:

I now use InstaPy to export all the likes on my profile. I set up quickstart.py with:

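# session is the InstaPy object created earlier in quickstart.py
try:
    session.login()
    session.set_dont_unfollow_active_users(enabled=True, posts=1000, boundary=50000)
    session.unfollow_users(amount=0, onlyInstapyFollowed=True, onlyInstapyMethod='FIFO', sleep_delay=600)
finally:
    # always close the browser session, even if a step above fails
    session.end()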

This works for me to output a list of profiles that liked my images. Of course the output needs to be logged somewhere, and in instapy.py you need to add print(active_users) inside def set_dont_unfollow_active_users (around line 1977 in my case), so that it looks something like this:

def set_dont_unfollow_active_users(self, enabled=False, posts=4, boundary=500):
        """Prevents unfollow followers who have liked one of
        your latest X posts"""

        # do nothing
        if not enabled:
            return

        # list of users who liked our media
        active_users = get_active_users(self.browser,
                                        self.username,
                                        posts,
                                        boundary,
                                        self.logger)

        print(active_users)

        for user in active_users:
            # include active user to not unfollow list
            self.dont_include.append(user)

After that the output file should contain a list of all profiles that liked your posts.
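
Instead of printing the list and capturing the console output, the usernames could be written straight to a file. The following is only a minimal sketch: save_usernames and the file name likers.txt are hypothetical and not part of InstaPy, and it assumes active_users is a plain list of username strings.

```python
from pathlib import Path


def save_usernames(usernames, path):
    """Write the usernames to path, one per line, sorted and de-duplicated.

    Hypothetical helper, not part of InstaPy. Returns the number written.
    """
    unique = sorted(set(usernames))
    text = "".join(name + "\n" for name in unique)
    Path(path).write_text(text, encoding="utf-8")
    return len(unique)
```

Under those assumptions, calling save_usernames(active_users, "likers.txt") in place of print(active_users) would leave one username per line in likers.txt.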

Now on to outputting all followers. Again the output has to be logged somewhere. My Instaloader script (I made a file called export_followers and put it in the instaloader folder) looks like this:

import instaloader

L = instaloader.Instaloader()

USER = 'your_account'
PASSWORD = 'your_password'
PROFILE = USER

L.login(USER, PASSWORD)

profile = instaloader.Profile.from_username(L.context, PROFILE)

for follower in profile.get_followers():
    print(follower.username)

That should produce a file containing all your followers, one per line.

I open that file in Word and replace every newline with ', ', which means searching for ^p and replacing it with ', '.
After that some cleanup is needed, such as adding [' at the beginning and checking that the beginning and the end look like a proper Python list.
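
The Word step could probably be skipped by letting Python read the file directly. This is a rough sketch, assuming the follower file holds one username per line; load_usernames and the file name followers.txt are made up for illustration.

```python
def load_usernames(path):
    """Return the non-empty lines of a text file as a list of usernames.

    Hypothetical helper: surrounding whitespace is stripped and blank
    lines are skipped, so no manual cleanup of the list is needed.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

With that in place, follower = load_usernames("followers.txt") would give a ready-made Python list.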

I paste those two lists into another simple Python script, which compares them and writes everything that doesn't match to a file. The script looks like this:

#list of followers:
follower = []

#list of likes:
liker = []

#compare
follower.sort()
liker.sort()

# usernames that appear in both lists
matches = sorted(set(follower) & set(liker))

print('All matches:')
print(matches)
print()
print()
print('Inactive followers:')

ban = [n for n in follower if n not in liker]

print(ban)

# the with block closes the file once the list has been written
with open("/YOUR PATH/inactive-users.txt", "w") as f:
    print(ban, file=f)

print()
print()
print('Done')

Where it says follower = [] and liker = [], you of course need to insert your own lists from the output above.

After all of that, there is a file containing all inactive users that follow you. I then decide manually whether I want to ban them in order to get rid of ghost followers.

Maybe this approach is helpful to someone. I know that it requires a bit of work, but I am a beginner at Python and couldn't automate the process any further.

@Thammus

Member

commented May 24, 2018

Hello nikartz,
your goal can easily be achieved using Instaloader alone. There is no need for other Python modules or text-editing software. To store inactive followers in a file, you can use the following approach:

import instaloader

L = instaloader.Instaloader()

USER = 'your_account'
PROFILE = USER

# Your preferred way of logging in:
L.load_session_from_file(USER)

profile = instaloader.Profile.from_username(L.context, PROFILE)

likes = set()
print('Fetching likes of all posts of profile {}.'.format(profile.username))
for post in profile.get_posts():
    print(post)
    likes = likes | set(post.get_likes())

print('Fetching followers of profile {}.'.format(profile.username))
followers = set(profile.get_followers())

ghosts = followers - likes

print('Storing ghosts into file.')
with open('/YOUR PATH/inactive-users.txt', 'w') as f:
    for ghost in ghosts:
        print(ghost.username, file=f)
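
The set arithmetic in the script above can be tried offline without logging in. The sketch below uses made-up username strings in place of the profile objects; it only illustrates the union and difference steps and makes no requests.

```python
# Made-up usernames standing in for the profile objects in the script above.
likes_per_post = [
    {"alice", "bob"},
    {"bob", "carol"},
]
followers = {"alice", "bob", "carol", "dave", "erin"}

likes = set()
for post_likes in likes_per_post:
    likes |= post_likes  # in-place union, same result as likes = likes | post_likes

# followers who never appear among the likers
ghosts = followers - likes
print(sorted(ghosts))
```

Here dave and erin follow the account but liked nothing, so they are the only ghosts.
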
@Thammus Thammus added the question label May 24, 2018
aandergr added a commit that referenced this issue Jun 4, 2018
Presents code examples that use the instaloader module for more advanced tasks
than what is possible with the Instaloader command line interface.

Presents #46, #56, #110, #113, #120, #121.
@LoreKeeperKen

commented Jun 8, 2018

@Thammus is there a way to have instaloader only pull likes from the latest five posts? Like @nikartz is utilizing instapy to do?

I'm trying to compare likes to ghosts on an account with 500 posts. It obviously takes a very long time to scrape and likes that are older than a couple weeks are stale and don't really prove currently active users.

@aandergr

Member

commented Jun 8, 2018

> @Thammus is there a way to have instaloader only pull likes from the latest five posts? Like @nikartz is utilizing instapy to do?

Sure. profile.get_posts() returns an iterator, which can be sliced with islice() from itertools. So instead of

for post in profile.get_posts():
    ...

you can use

from itertools import islice
for post in islice(profile.get_posts(), 5):
    ...

You can also use the post's age as a stop condition, where takewhile() comes in handy. For example,

from datetime import datetime, timedelta
from itertools import takewhile
NOW = datetime.now()
for post in takewhile(lambda p: NOW - p.date < timedelta(days=7), profile.get_posts()):
    ...
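
To see how the two differ without touching Instagram, here is a small offline sketch. FakePost and fake_posts are stand-ins invented for this example, and it assumes posts normally arrive newest first. One entry is deliberately out of order to show that takewhile() stops at the first post that fails the test and never looks further.

```python
from collections import namedtuple
from datetime import datetime, timedelta
from itertools import islice, takewhile

# FakePost is a stand-in for a real post object; only `date` is modelled.
FakePost = namedtuple("FakePost", ["shortcode", "date"])

NOW = datetime(2018, 6, 8, 12, 0)  # fixed so the example is reproducible


def fake_posts():
    """Yield posts aged 1, 3, 10 and 2 days; 'd' is deliberately out of order."""
    for shortcode, age_days in [("a", 1), ("b", 3), ("c", 10), ("d", 2)]:
        yield FakePost(shortcode, NOW - timedelta(days=age_days))


# islice: take a fixed number of posts, regardless of age
latest_two = [p.shortcode for p in islice(fake_posts(), 2)]

# takewhile: stop at the first post older than a week ('c'), so 'd' is never seen
last_week = [
    p.shortcode
    for p in takewhile(lambda p: NOW - p.date < timedelta(days=7), fake_posts())
]
```

Both lists end up containing only 'a' and 'b'. Post 'd' is recent enough but comes after 'c', so takewhile() never reaches it.
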
@LoreKeeperKen

commented Jun 8, 2018

Thanks @aandergr for so many options. I'll try them out. :)

@LoreKeeperKen

commented Jun 9, 2018

The islice() worked perfectly for finding recent activity.

I would like to have a list of all likes also as in Thammus' original example. But my profile has 500 posts and when I run .get_posts() it always errors out with the 429 too many requests halfway-ish through. How do I insert longer wait times between requests? Or is there a better way to prevent the 429 errors? I have no other instances of instaloader or anything related to instagram running on this machine or even from the same IP.

@aandergr

Member

commented Jun 11, 2018

A general note about the notorious 429 - Too Many Requests: Instaloader has logic to keep track of its requests to Instagram and to obey their rate limits. Since the limits are not documented anywhere, we determine them experimentally. We have a daily cron job running to confirm that Instaloader still stays within the rate limits. Nevertheless, the rate control logic assumes that

  • at any one time, Instaloader is the only application that consumes requests, i.e. neither the Instagram browser interface, nor a mobile app, nor another Instaloader instance is running in parallel,
  • no requests had been consumed when Instaloader starts.

The latter implies that restarting or reinstantiating Instaloader often within a short time is prone to cause a 429. When a request is denied with a 429, Instaloader retries the request as soon as the temporary ban is assumed to have expired.

(copy of my recent comment in #128)
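
On the question of inserting longer wait times: one generic option is to pause between items while looping. The sketch below is plain Python and not an Instaloader feature. throttled is a hypothetical helper, it assumes the iterator being wrapped fetches its data lazily, and there is no guarantee that any particular delay avoids a 429.

```python
import time


def throttled(iterable, delay_seconds, sleep=time.sleep):
    """Yield items from iterable, sleeping before yielding each item except the first.

    Hypothetical helper. The sleep function is injectable so the pause
    can be replaced in tests.
    """
    for index, item in enumerate(iterable):
        if index:
            sleep(delay_seconds)
        yield item
```

It could be used as for post in throttled(profile.get_posts(), 10): with the delay value picked by trial. Since Instaloader already retries after a 429 on its own, keeping a single long-running instance instead of restarting it may matter more than any added delay.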
