Sorting hat package creates identity with missing information when merged and unmerged #243

Open
code-sleuth opened this issue Feb 13, 2020 · 6 comments

Comments

@code-sleuth

When we merge one profile with another, we call sortinghat.api.merge_unique_identities (https://github.com/LF-Engineering/dev-analytics-sortinghat-api/blob/master/app/apis/profiles/apis.py#L190) with a from and a to UUID. The from identity is then deleted and its data is added to the to identity.

Afterwards, if we unmerge that identity from the to identity, for which we use sortinghat.api.move_identity (https://github.com/LF-Engineering/dev-analytics-sortinghat-api/blob/master/app/apis/profiles/apis.py#L269), the from identity that was deleted earlier is recreated, but it is missing the name, email, and other personal details it previously had.
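To make the reported behaviour concrete, here is a minimal in-memory sketch of the merge/unmerge sequence. It is a toy model, not SortingHat's code: `ToyRegistry`, `Profile`, `UniqueIdentity`, and the methods `add`, `merge`, and `move` are hypothetical names that only imitate the behaviour described above, on the assumption that merging discards the from profile and that moving an identity to a non-existent unique identity creates it with an empty profile.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Profile:
    name: Optional[str] = None
    email: Optional[str] = None


@dataclass
class UniqueIdentity:
    uuid: str
    profile: Profile = field(default_factory=Profile)
    identity_ids: Set[str] = field(default_factory=set)


class ToyRegistry:
    """Hypothetical stand-in for the identities database."""

    def __init__(self) -> None:
        self.uids: Dict[str, UniqueIdentity] = {}

    def add(self, uuid: str, name: str, email: str) -> None:
        self.uids[uuid] = UniqueIdentity(uuid, Profile(name, email), {uuid})

    def merge(self, from_uuid: str, to_uuid: str) -> None:
        # The "from" unique identity is deleted; only its identity ids
        # survive, attached to the "to" unique identity.
        src = self.uids.pop(from_uuid)
        self.uids[to_uuid].identity_ids |= src.identity_ids

    def move(self, identity_id: str, to_uuid: str) -> None:
        # Detach the identity from wherever it currently lives.
        for uid in self.uids.values():
            uid.identity_ids.discard(identity_id)
        # If the target does not exist, it is created with an empty profile.
        target = self.uids.setdefault(to_uuid, UniqueIdentity(to_uuid))
        target.identity_ids.add(identity_id)


registry = ToyRegistry()
registry.add("a", "Alice", "alice@example.com")
registry.add("b", "Alice B.", "alice.b@example.com")

registry.merge("a", "b")   # "a" is deleted and folded into "b"
registry.move("a", "a")    # "unmerge": "a" is recreated from scratch

recreated = registry.uids["a"]
print(recreated.profile)   # name and email are gone
```

In this model nothing records the profile of "a" before the merge, so the unmerge step has nothing to restore it from.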

@sduenas
Member

sduenas commented Feb 13, 2020

The current version of SortingHat doesn't track historic information to that level of detail, so it's not possible to recreate the previous identity. With the new experimental version (see the muggle branch) that would be possible, because there's a table which stores all the changes made to the identities, but currently there's no code to do that. I'm not sure it's something we should support.

What's your use case? Why do you need this feature?

@code-sleuth
Author

code-sleuth commented Feb 14, 2020

What's your use case?

If there are two identities with, let's say, two similar emails, I think that warrants merging them. If a user then thinks the merge was a mistake, they can unmerge.

Why do you need this feature?

Basically, to give users the ability to unmerge an identity if they mistakenly merged two identities.

@lukaszgryglicki

I see that the differences between muggle and master are huge.
@sduenas Is it safe to use sortinghat based on the muggle branch?
Can you please point me to the DB structure differences that are required to handle this?
I've generated a diff file, but it is so huge that it is hard to track the actual changes needed in the DB structure. Any chance that you could create another branch with unmerging support, rebased onto the current sortinghat master branch?

Here is the diff file mungle-master.diff.txt
cc @code-sleuth

@sduenas
Member

sduenas commented Feb 20, 2020

@lukaszgryglicki, the muggle branch is a totally different thing. It's still experimental and not integrated with any other component in the stack. You should not use that branch unless you want to contribute to its development. You have more info about it here: https://github.com/chaoss/grimoirelab-sortinghat/wiki/Roadmap-to-Sorting-Hat-1.0

I can try to rebase the branch onto master, but as they are incompatible I don't see the point of doing it right now.

@lukaszgryglicki

OK, thanks for the info. What about the changes to the DB structure needed to handle merge/unmerge operations?

@sduenas
Member

sduenas commented Feb 20, 2020

In muggle we use the Django ORM, not SQLAlchemy as in master, to deal with the database, so it's not only about DB structure. If you want to check it, it is in here

In any case, muggle doesn't implement what I think you want, which is an "undo" of certain operations. It's something that is possible to implement with the current schema, because we store all the operations done with SortingHat. So, using the event sourcing pattern, it would be possible to recreate some states or to roll back. This is not implemented yet and I'm not sure if we should follow that direction.
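As a rough illustration of the event sourcing idea mentioned above, the sketch below keeps an append-only log of operations and rebuilds state by replaying it. It is not muggle's schema or code: the event names (`profile_set`, `merged`), the dictionary layout, and the functions `apply_event` and `replay` are all assumptions made up for this example.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[str, dict]   # (kind, payload); hypothetical event format
State = Dict[str, dict]    # uuid -> {"name", "email", "identities"}


def apply_event(state: State, event: Event) -> None:
    """Apply a single logged operation to the in-memory state."""
    kind, data = event
    if kind == "profile_set":
        state[data["uuid"]] = {
            "name": data["name"],
            "email": data["email"],
            "identities": {data["uuid"]},
        }
    elif kind == "merged":
        src = state.pop(data["from_uuid"])
        state[data["to_uuid"]]["identities"] |= src["identities"]
    else:
        raise ValueError(f"unknown event kind: {kind}")


def replay(events: List[Event], upto: Optional[int] = None) -> State:
    """Rebuild state from the first `upto` events (all events if None)."""
    state: State = {}
    for event in events[:upto]:
        apply_event(state, event)
    return state


log: List[Event] = [
    ("profile_set", {"uuid": "a", "name": "Alice", "email": "alice@example.com"}),
    ("profile_set", {"uuid": "b", "name": "Alice B.", "email": "alice.b@example.com"}),
    ("merged", {"from_uuid": "a", "to_uuid": "b"}),
]

after_merge = replay(log)            # "a" no longer exists
before_merge = replay(log, upto=2)   # roll back by replaying without the merge
print(before_merge["a"]["name"])
```

Because the log is never rewritten, the pre-merge profile of "a" stays recoverable. A real "undo" would also have to decide how to treat operations recorded after the merge, which this sketch ignores.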

I'm also open to ideas about how to manage these cases, and to PRs that solve this issue.
