How to create mapping for item_features and build item_features if the feature contains multiple values for each item? #330

Closed
Ivanclj opened this issue Jul 13, 2018 · 26 comments

Comments

@Ivanclj

Ivanclj commented Jul 13, 2018

Hi,

I was following the Building datasets example and tried to build the MovieLens dataset myself. However, when I tried to build the mapping for items and item features, I realized that the feature I chose, the genres column in movies.csv, has multiple values in one cell (e.g. Action|Fantasy|Comedy) instead of just one value. How should I use dataset.fit_partial to create the mapping in this case? The same question applies to building interactions. Thank you so much!

@maciejkula
Collaborator

maciejkula commented Jul 13, 2018

Have you had a look at the docs and the example?

In particular, have a look at the docs for building features to understand what arguments you need to pass in to allow multiple features for each item: http://lyst.github.io/lightfm/docs/lightfm.data.html#lightfm.data.Dataset.build_item_features. To create mappings, you simply pass all possible features in sequentially.
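
For the MovieLens case in the question, one way to apply this is to split each pipe-separated genres cell into a list, pass every distinct genre name to `fit_partial`, and then pass one `(item_id, [genres])` pair per movie to `build_item_features`. The sketch below assumes rows shaped like movies.csv (`movieId`, `genres`); the helper names and sample rows are illustrative, and `lightfm` is imported only inside the function that needs it.

```python
def split_genres(cell, sep="|"):
    """Turn 'Action|Fantasy|Comedy' into ['Action', 'Fantasy', 'Comedy']."""
    return [g.strip() for g in cell.split(sep) if g.strip()]


def item_feature_pairs(rows):
    """Yield (item_id, [feature names]) pairs, the shape build_item_features accepts."""
    for row in rows:
        yield row["movieId"], split_genres(row["genres"])


def all_feature_names(rows):
    """Collect every distinct genre so the feature mapping can be fitted up front."""
    return sorted({g for _, genres in item_feature_pairs(rows) for g in genres})


def build_item_feature_matrix(rows):
    """Fit the item and feature mappings, then build the sparse feature matrix."""
    from lightfm.data import Dataset  # requires lightfm to be installed

    dataset = Dataset()
    dataset.fit_partial(
        items=(row["movieId"] for row in rows),
        item_features=all_feature_names(rows),
    )
    return dataset, dataset.build_item_features(item_feature_pairs(rows))


# Illustrative rows only; real data would come from movies.csv.
sample_rows = [
    {"movieId": 1, "genres": "Adventure|Animation|Children"},
    {"movieId": 2, "genres": "Action|Fantasy|Comedy"},
    {"movieId": 3, "genres": "Comedy"},
]
```

Interactions are unaffected by this: `build_interactions` still takes plain `(user_id, item_id)` pairs, so only the feature side needs the split.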

@hammadkhann

Hey, I have a dataframe with a column holding a list of tags for each entity, and another column holding categories in the same way. How do I feed these into the build_item_features function? Kindly help, I am confused.

@maciejkula
Collaborator

Have you read the documentation? What have you tried that didn't work? If you give me a couple of examples I'll be able to improve the documentation for future users.

@maciejkula
Collaborator

maciejkula commented Jul 14, 2018

This doesn't look right! Should we try to figure out constructing the interaction matrix first? Do you want to walk me through what you are trying to do and what you expect to happen?

(Can you also try using Markdown formatting for code blocks? It makes everything much easier to read.)

@hammadkhann

Can we talk somewhere else? LinkedIn, perhaps?

@maciejkula
Collaborator

I think I prefer to do it here.

@hammadkhann

OK, then let's start with the user interaction matrix. The idea is to recommend brands or entities from the particular category the user is searching in. To build the interaction matrix I use user activity data: if a user views an entity's details, I count that as an interaction; similarly, if a user has wishlisted an entity, I count that as a like. The user_interaction dataframe above was created from these two signals.

@hammadkhann

Is this the right way to build the interaction matrix?

@maciejkula
Collaborator

This makes sense. How do you build a dataset from it?

@hammadkhann

Our data is in Elasticsearch. I queried for the REST call responsible for viewing entity details, which gives us the entity id and the id of the user who viewed it. The wishlist data was queried from the user profile data.

@maciejkula
Collaborator

I mean, how do you create the sparse interaction matrix using the Dataset class?

@hammadkhann

Right now I am doing this using the entity id and user id from the dataframe I built (the user_interaction dataframe in the snapshot attached above):

(interactions, weights) = dataset.build_interactions(
    (x['user_id'], x['entity_id'])
    for index, x in user_interaction.iterrows()
)

@maciejkula
Collaborator

Please use Markdown code blocks.

This looks OK; does the resulting matrix have the shape and density you expect?
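
One way to run that check is to compute the shape and density directly from the matrix returned by `build_interactions`. The sketch below assumes only that the matrix exposes SciPy-style `.shape` and `.nnz` attributes; the helper name `matrix_summary` is illustrative.

```python
def matrix_summary(matrix):
    """Return (n_rows, n_cols, nnz, density) for a SciPy-style sparse matrix."""
    n_rows, n_cols = matrix.shape
    nnz = matrix.nnz  # number of stored entries
    density = nnz / (n_rows * n_cols) if n_rows and n_cols else 0.0
    return n_rows, n_cols, nnz, density
```

Calling `matrix_summary(interactions)` gives numbers that can be compared with `dataset.interactions_shape()` and with the number of rows in the `user_interaction` dataframe.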

@hammadkhann

I am fitting my data like this:

dataset = Dataset()
dataset.fit(
    (i['user_id'] for index, i in user_interaction.iterrows()),
    (i['entity_id'] for index, i in entity_data.iterrows()),
    item_features=entity_final[feat],
)

Is it right to take the entity ids from entity_data and not from user_interaction, given that not all entity ids will be present in the user_interaction data?
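
On where the ids should come from: `Dataset.fit` needs to see every id that is later passed to `build_interactions` or `build_item_features`, so fitting on the full catalogue is fine as long as it covers every entity in the interaction data. Entities without interactions then simply have empty columns in the interaction matrix. A minimal sketch of that coverage check, using hypothetical ids and a hypothetical helper name:

```python
def missing_ids(fitted_ids, used_ids):
    """Return the ids that are used later but were never passed to fit."""
    return sorted(set(used_ids) - set(fitted_ids))


# Hypothetical ids standing in for entity_data and user_interaction.
catalogue_entity_ids = ["e1", "e2", "e3", "e4"]
interaction_entity_ids = ["e2", "e2", "e4"]

uncovered = missing_ids(catalogue_entity_ids, interaction_entity_ids)
```

An empty `uncovered` list means every interacted entity is in the fitted mapping.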

@hammadkhann

Yes, it has the shape and density I expected.

@maciejkula
Collaborator

Have a look at the Markdown guide to understand how to do code blocks.

I'm sorry, I am not quite sure what your question is.

@hammadkhann

Sorry, let me take a look at it.

@hammadkhann

dataset = Dataset()
dataset.fit(
    (i['user_id'] for index, i in user_interaction.iterrows()),
    (i['entity_id'] for index, i in entity_data.iterrows()),
    item_features=entity_final[feat],
)

That's how I am fitting my data. Now, how can I build my item feature matrix? I attached a screenshot of the dataframe above; it is named entity_final.

@maciejkula
Collaborator

maciejkula commented Jul 14, 2018

OK. You need to pass an iterable of (entity_id, [features of that entity]) tuples.

Assuming (for example) your features are in the Tags column, you could do something akin to this:

features = [(x['entity_id'], x['Tags'].split(', ')) for _, x in dataframe.iterrows()]
dataset.build_item_features(features)

Remember that you have to pass the feature names to fit before running this, so that the feature name mappings are constructed first.
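
The ordering described above (register ids and feature names first, then build) can be sketched as follows. This is a minimal illustration, not the library's prescribed recipe: it assumes an `entity_id` column and a comma-separated `Tags` column, the helper names are made up, and the rows are stand-ins for what `DataFrame.iterrows()` yields, namely `(index, row)` pairs.

```python
def tag_features(indexed_rows, id_col="entity_id", tags_col="Tags", sep=","):
    """Build (entity_id, [tags]) pairs from (index, row) pairs such as iterrows()."""
    return [
        (row[id_col], [t.strip() for t in row[tags_col].split(sep) if t.strip()])
        for _, row in indexed_rows
    ]


def fit_and_build(dataset, features):
    """Register entity ids and tag names first, then build the feature matrix."""
    dataset.fit_partial(
        items=(entity_id for entity_id, _ in features),
        item_features=(tag for _, tags in features for tag in tags),
    )
    return dataset.build_item_features(features)


# Hypothetical rows standing in for dataframe.iterrows().
rows = list(enumerate([
    {"entity_id": "e1", "Tags": "coffee, breakfast"},
    {"entity_id": "e2", "Tags": "coffee"},
]))
features = tag_features(rows)
```

With LightFM installed, this would be used as `fit_and_build(Dataset(), tag_features(entity_final.iterrows()))` after `from lightfm.data import Dataset`.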

@hammadkhann

Below is the code I am currently using to build the item features:

entity_features=list(entity_final.columns)
#deleting entity id from entity features list.
del entity_features[1]
#building item features dataset
item_features = dataset.build_item_features(((x['entity_id'], feat) 
                              for index,x in entity_final.iterrows()))
print(item_features)

@hammadkhann

Yes, I know; that's why I passed the feature names when fitting the dataset.

@maciejkula
Collaborator

Does it work? If not, can you try what I suggested?

@hammadkhann

It runs, but it is not giving me good results, and I am not sure what is happening underneath with these features. Thanks for your help; let me try your suggestion.

@hammadkhann

What if I want to use multiple features for an entity? For example, alongside the tags I also want to use the entity's rank and rating as features. How do I pass those?

@maciejkula
Collaborator

You have to combine all features for a given entity into a single list.
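
A minimal sketch of combining tags, rank, and rating into one list per entity. The column names (`Tags`, `rank`, `rating`), the bucketing scheme, and the `rank:`/`rating:` prefixes are assumptions for illustration, not something the thread or the library prescribes. Numeric values are turned into categorical feature names here; `build_item_features` also accepts a `{feature name: weight}` dict per item if weights are preferred.

```python
def combined_features(row):
    """Merge tags, rank, and rating into one list of feature names for an entity."""
    tags = [t.strip() for t in row["Tags"].split(",") if t.strip()]
    # Bucket numeric columns into a small set of categorical feature names.
    rank_feature = "rank:top10" if row["rank"] <= 10 else "rank:other"
    rating_feature = f"rating:{round(row['rating'])}"
    return tags + [rank_feature, rating_feature]


# Hypothetical row standing in for one row of entity_final.
example = combined_features({"Tags": "coffee, breakfast", "rank": 3, "rating": 4.4})
```

Every name produced this way, including the `rank:` and `rating:` buckets, has to be passed to `fit` as an item feature before `build_item_features` is called.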

@hammadkhann

OK, thank you. It's working great now :)
