Deal with duplicates on import #446

Closed
wants to merge 10 commits into from

tim-pearce
Contributor

If a person of the same name exists, they are assumed to be the same person and will not be updated except to be added to the group nominated in the import.
Other people being imported as the same family will be added to the existing family.
This behaviour, along with a note about default group status, has been added to the explanation.
At the end it reports about existing members and families so you can make corrections. (I had one case where a child had been brought along as part of two different families).
This addresses #441.

@tbar0970
Owner

tbar0970 commented Feb 7, 2018

Thanks for this. I've looked through it and have a few thoughts:

  1. We should make this duplicate-matching behaviour optional - a checkbox on the import form "re-use existing persons with matching first and last name". Because in my current church we actually have two different people with the same name.
  2. If it matches an existing person record I reckon it really should update its details from the import file. It could well be that the existing person has the name but no contact details, but the file does have contact details. We wouldn't want that data to be lost. I think it's fair to overwrite the existing record's values with non-blank values from the import file (but if a field is empty in the import file, we shouldn't wipe out an existing value in the existing person record).
  3. Right now it matches archived person records too. I guess this is fair enough but if you're importing them you probably don't want them to be archived any more. I guess the previous point will solve this - their status will be updated to whatever is in the import file.
  4. Just as a thought: maybe we should also/alternatively match on fields like email or mobile number? Maybe something for later.
  5. Can you have another go at creating a clean pull request with just the changes for this issue? I'm not a git expert, but I think it might be that you need to do a git rebase on your branch rather than a git merge. Or if it was me I think I would probably just create a clean branch and manually apply the changes there.
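
Points 1 to 3 above can be sketched roughly as follows. This is only an illustration in Python, not Jethro's actual (PHP) import code: the function `import_row`, the `match_duplicates` flag, and field names such as `first_name`, `email`, and `status` are all hypothetical. It matches on first and last name only when the option is enabled, overwrites existing values with non-blank values from the import row, and leaves existing values alone where the import row is blank.

```python
# Hypothetical field names; the real import columns may differ.
NAME_FIELDS = ("first_name", "last_name")


def _name_key(record):
    """Case-insensitive, whitespace-trimmed (first, last) name pair."""
    return tuple(record[f].strip().lower() for f in NAME_FIELDS)


def _is_blank(value):
    return value is None or str(value).strip() == ""


def import_row(people, row, match_duplicates=True):
    """Import one row into `people` (a list of dicts).

    Returns (person, was_existing). When match_duplicates is True and a
    person with the same first and last name exists, that record is updated
    with the row's non-blank values; blank values in the row never wipe out
    existing data. Otherwise a new person is appended.
    """
    if match_duplicates:
        for person in people:
            if _name_key(person) == _name_key(row):
                for name, value in row.items():
                    if name not in NAME_FIELDS and not _is_blank(value):
                        person[name] = value
                return person, True

    # No match (or matching disabled): create a new record.
    person = {k: v for k, v in row.items()
              if k in NAME_FIELDS or not _is_blank(v)}
    people.append(person)
    return person, False
```

Because `status` is treated like any other field in this sketch, an archived person who appears in the import file with a non-blank status would pick up that status, which is the effect described in point 3.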

@tim-pearce
Contributor Author

These changes have already served me fine.

My 'use case' is importing from a number of email contact lists (a play group and two youth groups) where I expect duplicates. In addition, families register anew each year, so when I add them to the 2018 groups I can make any corrections. Having them pre-added saves me a lot of data entry and captures the history of their attendance in previous years.

I agree that duplicate matching and archived member handling should be options.

Things other than name (and for adults, surname may change as well!) are prone to change. The instance of two people with the same name is an 'exception'. In your case you have just one and the import would tell you it found a duplicate and thus you could correct it.

Do you have any idea about the 'normal' 'use case' for the import function? I would expect it to be similar to mine.

BTW The export (from Roundcube) was in VCF format so I hacked a vcf to csv converter (php - it's on the other computer so don't have the name to hand). It's tailored to my data, but I can provide a plain vanilla version if anyone wants it.
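
For anyone curious what such a conversion involves, here is a minimal sketch in Python. It is not the PHP converter mentioned above. The output column names (`first_name`, `last_name`, `email`, `mobile`) are assumptions for illustration rather than the import's actual headers, and the parser deliberately ignores vCard features such as folded lines, grouped properties, and multiple addresses per contact.

```python
import csv
import io

# Hypothetical output columns; adjust to whatever the import expects.
COLUMNS = ["first_name", "last_name", "email", "mobile"]


def vcf_to_rows(vcf_text):
    """Extract one dict per vCard, keeping the first EMAIL and TEL found."""
    rows = []
    current = None
    for raw in vcf_text.splitlines():
        line = raw.strip()
        upper = line.upper()
        if upper == "BEGIN:VCARD":
            current = dict.fromkeys(COLUMNS, "")
        elif upper == "END:VCARD":
            if current is not None:
                rows.append(current)
            current = None
        elif current is not None and ":" in line:
            prop, value = line.split(":", 1)
            # Drop parameters such as ";TYPE=CELL" from the property name.
            name = prop.split(";", 1)[0].upper()
            if name == "N":
                # The N property is Family;Given;Additional;Prefix;Suffix.
                parts = value.split(";")
                current["last_name"] = parts[0].strip()
                if len(parts) > 1:
                    current["first_name"] = parts[1].strip()
            elif name == "EMAIL" and not current["email"]:
                current["email"] = value.strip()
            elif name == "TEL" and not current["mobile"]:
                current["mobile"] = value.strip()
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Calling `rows_to_csv(vcf_to_rows(text))` on the contents of an exported `.vcf` file gives CSV text that can be saved and tidied by hand before importing.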

Tim

@tbar0970
Owner

Hi Tim

The common use case so far was initial system setup and migrating from some other database. But your use case is also a good one to cover and opens up new possibilities.

It sounds to me like the update-details behaviour I suggest would suit your scenario better. If somebody has re-registered for playgroup this year and has provided either updated details or more complete details, wouldn't you want those details to be pulled into Jethro automatically?

Are you happy to go ahead and add the extra checkbox to enable/disable duplicate matching, and to create a clean pull request with just the changes for this feature?

Tom

@tim-pearce
Contributor Author

Tom,

Yes, happy enough to do that. Busy administrating 3 youth ministries plus plenty of other stuff, so probably won't be too immediate.

The email address books have effectively been used as a database, so the current information at time of import was the most recent. If I did it again I would probably import the older group first as families moving through 'the system' are more likely to end in the older groups.

Tim

@tim-pearce tim-pearce closed this Feb 27, 2018
@tim-pearce
Contributor Author

Oops, didn't really mean to close that!

@tim-pearce tim-pearce reopened this Feb 27, 2018
@tbar0970
Owner

I had a look through this again, and have been working on a more full-blown solution that also updates person/family fields etc. See Issue #441. Closing this pull request.

@tbar0970 tbar0970 closed this Nov 30, 2018
@tim-pearce tim-pearce deleted the import branch September 14, 2021 03:34