Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom AggregatedActivity not working #23

Closed
intellisense opened this issue Jan 16, 2014 · 22 comments
Closed

Custom AggregatedActivity not working #23

intellisense opened this issue Jan 16, 2014 · 22 comments

Comments

@intellisense
Copy link
Contributor

I have created a CustomAggregatedActivity extended from feedly.activity.AggregatedActivity:

from feedly.activity import AggregatedActivity

class CustomAggregatedActivity(AggregatedActivity):
    @property
    def abc(self):
        pass

And then i have used this in my custom aggregator:

class CustomAggregator(BaseAggregator):
    aggregation_class = CustomAggregatedActivity

    def rank(self, aggregated_activities):
        # rank logic here

     def group(self, activity):
         # group logic here

Then i have assigned the CustomAggregator in:

class AggregatedUserFeed(RedisAggregatedFeed):
    aggregator_class = CustomAggregator
    key_format = 'feed:aggregated:%(user_id)s'

and finally:

class Newsfeed(Feedly):
    feed_classes = dict(
        normal=NormalUserFeed,
        aggregated=AggregatedUserFeed
    )
    user_feed_class = UserFeed
newsfeed_stream = Newsfeed()

Now when i get user aggregated feed:

feed = newsfeed_stream.get_feeds(user_id)['aggregated']
aggregated_activities = list(feed[:10])
>>> aggregated_activities[0].abc

It says:

AttributeError: 'AggregatedActivity' object has no attribute 'abc'

Can you tell me why this happened?

Thanks!

@tschellenbach
Copy link
Owner

Yes the serializer class in this case needs to know about which object you use.

# add this serializer class
class MyAggregatedActivitySerializer(AggregatedActivitySerializer):
    aggregated_class = CustomAggregatedActivity
# and point to it
class AggregatedUserFeed(RedisAggregatedFeed):
    aggregator_class = CustomAggregator
    key_format = 'feed:aggregated:%(user_id)s'
    timeline_serializer = MyAggregatedActivitySerializer

That should ensure it returns the right type of object.

At the moment some serializers need to know about the type of object, others don't.
@tbarbugli maybe we can always forward this info from the feed class to the serializer?
What do you think?

@tschellenbach
Copy link
Owner

Added two tickets, we can make it easier to get started with customization of the activity/aggregated activity object and the real time notifications

#24
#25

@intellisense
Copy link
Contributor Author

  1. Do i still need to use CustomAggregatedActivity in CustomAggregator or using it in MyAggregatedActivitySerializer (as you have pointed out) is enough?:

Thanks for quick response. I am working heavily on this so more questions may come in future :)

@tschellenbach
Copy link
Owner

You need both, the serializer doesn't always need it. This serializer is optimized for storage space and doesn't store the object type in the serialization.

@intellisense
Copy link
Contributor Author

Thanks for your help, so far so good. Now stuck again. I am trying to change the criteria to identify duplicate activity, so i override contains method in my CustomAggregatedActivity but it is never called:

from feedly.activity import AggregatedActivity

class CustomAggregatedActivity(AggregatedActivity):
    def contains(self, activity):
        # my logic here for duplicate activity

It is obvious that this should be called but not. My feed is working fine with the custom aggregator and assume that the previous issue is also fixed.

@tbarbugli
Copy link
Collaborator

The contains method should be called by the aggregator class in the merge method.
@tschellenbach do we really need the try/except: pass in the aggregator.merge method ? ( https://github.com/tschellenbach/Feedly/blob/master/feedly/aggregators/base.py#L82 )

@intellisense
Copy link
Contributor Author

Strange it is not called though. For testing purpose I have tried to copy the same contains method written in feedly.activity.py and added in CustomAggregatedActivity:

from feedly.activity import AggregatedActivity

class CustomAggregatedActivity(AggregatedActivity):
    def contains(self, activity):
        '''
        Checks if activity is present in this aggregated
        '''
        print "here" # this never printed
        if not isinstance(activity, Activity):
            raise ValueError('contains needs an activity not %s', activity)
        return activity.serialization_id in set([a.serialization_id for a in self.activities])

Doing so now feed.add_activity(activity, user) just failing silently. I think there is some hidden issue with this and i just can't think where would be the issue.

@intellisense
Copy link
Contributor Author

Forget it, i found it the problem, it was my fault. You can close this issue.

Thanks

@intellisense
Copy link
Contributor Author

I have this group logic for aggregator:

def get_group(self, activity):
        '''
        Returns a group based on the verb and action object_id
        There will/should never be aggregated activities based on different action objects
        All assumptions later on based on the logic that aggregated activities have similar action object
        e.g. User A, User B commented on Action Object
        '''
        verb = activity.verb.id
        object_id = activity.object_id
        group = '%s-%s' % (verb, object_id)
        return group

According to this group logic there will be similar activities from actors but on same action object. There will never be an activity in aggregated activity whose object_id will be different. Now my problem at the moment is, when User B performs activity, User A activity is being removed from aggregated activity and i don't know why.

User A(id:1) and User B(id:2) are friends of User C(id:3)
Action1: User A commented on abc(id: 16)
>>> list(newsfeed_stream.get_feeds(User C)['aggregated'][:10])
[AggregatedActivity(9-16-commented) Actors 1: Objects [16]]
Action2: User B commented on abc
**Expected**
>>> list(newsfeed_stream.get_feeds(User C)['aggregated'][:10])
[AggregatedActivity(9-16-commented) Actors 1,2: Objects [16]]
**Actual Result**
>>> list(newsfeed_stream.get_feeds(User C)['aggregated'][:10])
[AggregatedActivity(9-16-commented) Actors 2: Objects [16]]

Why the previous activity in aggregated activity get lost? Is there any false assumption somewhere that if different activities has similar object_id those should be consider duplicate?

My Duplicate checking logic:

def contains(self, activity):
        '''
        Checks if activity is present in this aggregated
        An activity should only be consider duplicate if:
        1. If actor_id in self.actor_ids AND
        2. activity.verb.past_tence == self.verb.past_tence
        '''
        duplicate = False
        if not isinstance(activity, Activity):
            raise ValueError('contains needs an activity not %s', activity)
        if activity.actor_id in self.actor_ids and activity.verb.past_tence == self.verb.past_tence:
            duplicate = True
        return duplicate

@intellisense
Copy link
Contributor Author

P.S This is the last issue i am facing i will not bother again :)

@tbarbugli
Copy link
Collaborator

Hi,
no problem at all, can you list the name of the methods you override in
your custom aggregator class and aggregated feed ?

2014/1/17 intellisense notifications@github.com

P.S This is the last issue i am facing i will not bother again :)


Reply to this email directly or view it on GitHubhttps://github.com//issues/23#issuecomment-32621125
.

@intellisense
Copy link
Contributor Author

Yes sure here they are:

NewsFeedAggregatedActivity:


class NewsFeedAggregatedActivity(AggregatedActivity):

    @property
    def recent_actor(self):
        activity = self.last_activity
        return User.objects.get(id=activity.actor_id)

    def sanity_check(self):
        # According to aggregator defined in newsfeed.aggregators.NewsFeedAggregator
        # len(self.object_ids) should always be 1
        if len(self.object_ids) != 1:
            raise Exception, "Something is wrong with aggregation"

    def contains(self, activity):
        '''
        Checks if activity is present in this aggregated
        An activity should only be consider duplicate if:
        1. If actor_id in self.actor_ids AND
        2. activity.verb.past_tence == self.verb.past_tence
        '''
        duplicate = False
        if not isinstance(activity, Activity):
            raise ValueError('contains needs an activity not %s', activity)
        if activity.actor_id in self.actor_ids and activity.verb.past_tence == self.verb.past_tence:
            duplicate = True
        return duplicate

Custom Aggregator:

class NewsFeedAggregator(BaseAggregator):
    '''
    Aggregates which should aggregate as "User A, B and C commented on post" 
    '''
    aggregation_class = NewsFeedAggregatedActivity

    def rank(self, aggregated_activities):
        '''
        The ranking logic, for sorting aggregated activities
        '''
        aggregated_activities.sort(key=lambda a: a.updated_at, reverse=True)
        return aggregated_activities

    def get_group(self, activity):
        '''
        Returns a group based on the verb and action object_id
        There will/should never be aggregated activities based on different action objects
        All assumptions later on based on the logic that aggregated activities have similar action object
        So this is very important function
        '''
        verb = activity.verb.id
        object_id = activity.object_id
        group = '%s-%s' % (verb, object_id)
        return group

Activity Serializer:

from feedly.serializers.aggregated_activity_serializer import AggregatedActivitySerializer

from .activity import NewsFeedAggregatedActivity

class NewsFeedAggregatedActivitySerializer(AggregatedActivitySerializer):
    aggregated_class = NewsFeedAggregatedActivity

Feeds:

from feedly.aggregators.base import RecentVerbAggregator
from feedly.feeds.redis import RedisFeed
from feedly.feeds.aggregated_feed.redis import RedisAggregatedFeed
from feedly.feeds.aggregated_feed.notification_feed import NotificationFeed

from .aggregators import NotificationAggregator, NewsFeedAggregator
from serializers import NewsFeedAggregatedActivitySerializer


class UserNotificationFeed(NotificationFeed):
    # : they key format determines where the data gets stored
    key_format = 'feed:notification:%(user_id)s'

    # : the aggregator controls how the activities get aggregated
    aggregator_class = NotificationAggregator


class NormalUserFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'


class AggregatedUserFeed(RedisAggregatedFeed):
    aggregator_class = NewsFeedAggregator
    key_format = 'feed:aggregated:%(user_id)s'
    timeline_serializer = NewsFeedAggregatedActivitySerializer


class UserFeed(NormalUserFeed):
    key_format = 'feed:user:%(user_id)s'

Feedly Feeds:

from django.contrib.auth.models import User

from feedly.feed_managers.base import Feedly
from feedly.feed_managers.base import FanoutPriority
from .models import followers

from .feeds import AggregatedUserFeed, NormalUserFeed, \
    UserFeed


class Newsfeed(Feedly):
    feed_classes = dict(
        normal=NormalUserFeed,
        aggregated=AggregatedUserFeed
    )
    user_feed_class = UserFeed

    def add_activity(self, activity, user):
        # add user activity adds it to the user feed, and starts the fanout
        self.add_user_activity(user.id, activity)

    def remove_activity(self, activity, user):
        # removes the activity from the user's followers feeds
        self.remove_user_activity(user.id, activity)

    def get_user_follower_ids(self, user_id):
        user = User.objects.get(id=user_id)
        ids = [u.id for u in followers(user)]
        return {FanoutPriority.HIGH:ids}

newsfeed_stream = Newsfeed()

@tschellenbach
Copy link
Owner

It seems as though object_id isn't unique. For Fashiolista the object_ids are unique. For commenting.

Comment 1: User A commented on 22 (user a, object id 1, target_id 22)
Comment 2: User B comment on 33 (user b, object id 2, target 33)

I didn't have a chance to have a good look at it just yet, but i have the feeling you swapped the target_id and object_id. Object id is usually unique.

@tschellenbach
Copy link
Owner

PS don't worry, we'll keep on replying till it works :)

@intellisense
Copy link
Contributor Author

Funny+Weird I have tested my aggregator in shell console and it works fine as expected:

>>> from feedly.activity import Activity
>>> from newsfeed.aggregators import NewsFeedAggregator
>>> from newsfeed.verbs import CommentVerb
>>> a = []
>>> a.append(Activity(1, CommentVerb, 1))
>>> a.append(Activity(2, CommentVerb, 1))
>>> print a
[Activity(commented) 1 1, Activity(commented) 2 1]
>>> aggregator = NewsFeedAggregator()
>>> aggregated_activities = aggregator.aggregate(a)
>>> print aggregated_activities
[AggregatedActivity(7-1-commented) Actors 1,2: Objects [1]]

So i think the problem is here newsfeed_stream.add_activity(activity, user) or Feedly.add_user_activity

@intellisense
Copy link
Contributor Author

As you can see in above testing the object_id is not unique still it aggregates properly. I have tried it passing a same creation time as well it works. So i am wondering why add_activity acting differently?

@tschellenbach
Copy link
Owner

Think I found it :)
Try the command line version with merge instead of aggregate.
(The add many calls aggregator.merge)

# in aggregated feed.add_many
new, changed, deleted = aggregator.merge(
            current_activities, activities)

@intellisense
Copy link
Contributor Author

I have tried the merge, here are the results, assuming we have aggregated activities from my previous testing code aggregated_activities = aggregator.aggregate(a):

a3 = Activity(3, CommentVerb, 1)
>>> new, changed, deleted = aggregator.merge(aggregated_activities, [a3])
>>> print new
[]
>>> print changed
[(AggregatedActivity(9-1-commented) Actors 1,2: Objects [1], AggregatedActivity(9-1-commented) Actors 1,2,3: Objects [1])]
>>> print deleted
[]

I can't see the problem here as well :(

@intellisense
Copy link
Contributor Author

I have achieved this so far with your help guys :)

screenshot from 2014-01-17 23 49 27

If you guys help me in fixing the above issue (the one last issue i am having), i would be able to do this:

screenshot from 2014-01-17 23 56 46

@intellisense
Copy link
Contributor Author

Thanks i have fixed it :)

@xsanch
Copy link

xsanch commented Feb 2, 2014

@intellisense can you just point where the issue was ? I am now studying this example as I would like to achieve simmmilar results.

Regards,

Jorge

@intellisense
Copy link
Contributor Author

@xsanch i have addressed multiple issues here, can you please point to the specific issue so that i can help? Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants