Command to deduplicate contacts #5344

Merged
merged 24 commits into mautic:staging from alanhartless:feature-dedup-command on May 25, 2018

@alanhartless
Contributor

alanhartless commented Nov 17, 2017

| Q | A |
| --- | --- |
| Bug fix? | |
| New feature? | y |
| Automated tests included? | y |
| Related user documentation PR URL | todo |
| Related developer documentation PR URL | |
| Issues addressed (#s or URLs) | |
| BC breaks? | |
| Deprecations? | |

Description:

This PR adds a command, php app/console mautic:contacts:dedup, that finds contacts with the same unique identifiers and merges them. By default, records are merged from oldest into newest, but that can be reversed with the --newest-into-oldest flag.

Regardless of which record is merged into which, the most recently modified record's custom field data is used, except where a value is empty, since we err on the side of not losing data. The latest last active date and the oldest date identified are also preserved.

When more than one unique identifier is configured, contacts whose unique identifiers conflict are not merged.
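
The merge rules described above can be sketched in plain PHP. This is a minimal illustration and not the code in this PR: the function name mergeContactData, the array keys (date_modified, last_active, date_identified, fields), and the array-based contact shape are all assumptions made for the example.

```php
<?php
// Hypothetical sketch of the merge rules in the description above.
// Contacts are plain arrays here; the PR itself works with Lead entities.

/**
 * Combine the values of two duplicate contacts.
 *
 * @param array $a First contact
 * @param array $b Second contact
 *
 * @return array Merged values
 */
function mergeContactData(array $a, array $b): array
{
    // The most recently modified record supplies the custom field data.
    [$newer, $older] = $a['date_modified'] >= $b['date_modified'] ? [$a, $b] : [$b, $a];

    $fields = $older['fields'];
    foreach ($newer['fields'] as $alias => $value) {
        // Keep the older value when the newer one is empty, so data is not lost.
        if ($value !== null && $value !== '') {
            $fields[$alias] = $value;
        }
    }

    return [
        'fields'          => $fields,
        // The latest last active date wins.
        'last_active'     => max($a['last_active'], $b['last_active']),
        // The oldest date identified wins.
        'date_identified' => min($a['date_identified'], $b['date_identified']),
    ];
}

$merged = mergeContactData(
    [
        'date_modified'   => '2017-01-10 00:00:00',
        'last_active'     => '2017-01-09 00:00:00',
        'date_identified' => '2016-05-01 00:00:00',
        'fields'          => ['email' => 'jane@example.com', 'city' => 'Boston'],
    ],
    [
        'date_modified'   => '2017-06-01 00:00:00',
        'last_active'     => '2017-03-01 00:00:00',
        'date_identified' => '2016-09-01 00:00:00',
        'fields'          => ['email' => 'jane@example.com', 'city' => ''],
    ]
);
```

In this example the second record is newer, but its empty city does not overwrite the older record's value. The dates are compared as strings, which only works because they share the same Y-m-d H:i:s format.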

Steps to test this PR:

  1. Hack the database to update multiple contacts with the same email OR manually edit through the UI to ensure there are multiple contacts with the same unique identifiers.
  2. Run the command and they should be merged along with all of their behavior data.

List deprecations along with the new alternative:

  1. LeadModel::mergeLeads is deprecated. ContactMerger should be used instead
@alanhartless alanhartless changed the title Command to deduplicate contacts WIP - Command to deduplicate contacts Nov 17, 2017
@dongilbert

Member

dongilbert commented Nov 17, 2017

+1 works well. Tentative approval pending UT's

@javjim

Contributor

javjim commented Nov 17, 2017

works properly for me

@alanhartless alanhartless force-pushed the alanhartless:feature-dedup-command branch 2 times, most recently to 9ab8020 Nov 18, 2017
@luizeof


luizeof commented Dec 7, 2017

@alanhartless .... 2.12.0 fix duplicating contacts on import .... this pull will be released on 2.12.0 too?

Thanks ;)

@vesper8


vesper8 commented Apr 7, 2018

I need to use this, I have a bunch of duplicate contacts and am looking for a safe way to merge them

2.12 came out already but this didn't make it in? How can I use this now? I must create a fork or is there another way?

@alanhartless alanhartless force-pushed the alanhartless:feature-dedup-command branch from 123476f to 5ed1e32 Apr 17, 2018
@alanhartless alanhartless added this to the 2.14.0 milestone Apr 17, 2018
@alanhartless alanhartless changed the title WIP - Command to deduplicate contacts Command to deduplicate contacts Apr 19, 2018
@alanhartless alanhartless force-pushed the alanhartless:feature-dedup-command branch from 21bb42f to f103a5b Apr 19, 2018
protected function execute(InputInterface $input, OutputInterface $output)
{
    /** @var ContactDeduper $deduper */
    $deduper = $this->getContainer()->get('mautic.lead.deduper');

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Can we use DI for new commands?

use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;
class DedupCommand extends ModeratedCommand

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Is "dedup" a real word in English? I think DeduplicateCommand would be a better name.

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

Probably more slang but I can rename it.

/**
 * @var ContactMerger
 */
private $merger;

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Any reason why not to use $contactMerger?

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

No reason why not. I'll change it.

/**
 * @var LeadRepository
 */
private $repository;

@Maxell92

Maxell92 Apr 26, 2018

Contributor

$leadRepository?

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

No reason why not. I'll change it.

namespace Mautic\LeadBundle\Exception;
class ValueNotMergeable extends \Exception

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Should be ValueNotMergeableException

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

👍

*/
public function __construct($newerValue, $olderValue)
{
parent::__construct(var_export($newerValue, true).' / '.var_export($olderValue, true));

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Why is var_export used here? If there is a real reason, I think there should be a comment explaining it.

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

Just to add in the message what the value actually was. $newerValue can be anything so var_export converts it to a string for the exception message.
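
A small sketch of what that buys you, using a stand-in class rather than the one in this PR (the name ValueNotMergeableSketch is made up for the example):

```php
<?php
// Hypothetical stand-in for the exception discussed above; the class in the PR
// lives in Mautic\LeadBundle\Exception and may differ in detail.
class ValueNotMergeableSketch extends \Exception
{
    /**
     * @param mixed $newerValue
     * @param mixed $olderValue
     */
    public function __construct($newerValue, $olderValue)
    {
        // var_export(..., true) returns a string representation instead of printing it,
        // so null, booleans and arrays can all be embedded in the message.
        parent::__construct(var_export($newerValue, true).' / '.var_export($olderValue, true));
    }
}

$e = new ValueNotMergeableSketch(null, false);
echo $e->getMessage(), PHP_EOL; // NULL / false
```

Plain string concatenation would turn both null and false into an empty string, so the message could not tell the two apart.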

*
* @return Lead
*/
public function mergeLeads(Lead $lead, Lead $lead2, $autoMode = true)

@Maxell92

Maxell92 Apr 26, 2018

Contributor

Why is this change here? I think this method should just call the new method, right?

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

It was relocated from elsewhere in the class. But there is a circular dependency if I use the new service here. Will have to create some kind of legacy class just to get around that.

@alanhartless

alanhartless Apr 26, 2018

Author Contributor

I had to address this by passing the container to a legacy class to keep from having the circular dependency. Will likely need to leverage this class for some other methods in the LeadModel to remove more deprecated code/circular dependencies. I want to get rid of checkForDuplicateContact as well and use the ContactDeduper class instead.
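
Roughly, the idea of resolving the new service through the container at call time, instead of injecting it into the constructor, could look like the sketch below. Every name in it (TinyContainer, ModelSketch, MergerSketch, LegacyMergerSketch, and the service ids) is invented for illustration; the actual classes in this PR will differ.

```php
<?php
// Hypothetical sketch: break a constructor-level circular dependency by
// looking the collaborator up from a container only when it is needed.

class TinyContainer
{
    /** @var callable[] */
    private $factories = [];
    /** @var object[] */
    private $instances = [];

    public function set(string $id, callable $factory): void
    {
        $this->factories[$id] = $factory;
    }

    public function get(string $id)
    {
        // Build each service once, on first request.
        if (!isset($this->instances[$id])) {
            $this->instances[$id] = ($this->factories[$id])($this);
        }

        return $this->instances[$id];
    }
}

class ModelSketch
{
    private $legacy;

    public function __construct(LegacyMergerSketch $legacy)
    {
        $this->legacy = $legacy;
    }

    // Deprecated entry point that forwards to the new service via the legacy shim.
    public function mergeLeads(string $a, string $b): string
    {
        return $this->legacy->merge($a, $b);
    }
}

class MergerSketch
{
    private $model;

    // The new service needs the model, which is what would create the cycle.
    public function __construct(ModelSketch $model)
    {
        $this->model = $model;
    }

    public function merge(string $loser, string $winner): string
    {
        return $loser.' merged into '.$winner;
    }
}

class LegacyMergerSketch
{
    private $container;

    public function __construct(TinyContainer $container)
    {
        $this->container = $container;
    }

    public function merge(string $a, string $b): string
    {
        // Resolved at call time, after both services can be constructed.
        return $this->container->get('merger')->merge($a, $b);
    }
}

$c = new TinyContainer();
$c->set('legacy', function (TinyContainer $c) { return new LegacyMergerSketch($c); });
$c->set('model', function (TinyContainer $c) { return new ModelSketch($c->get('legacy')); });
$c->set('merger', function (TinyContainer $c) { return new MergerSketch($c->get('model')); });

$result = $c->get('model')->mergeLeads('contact #1', 'contact #2');
```

The trade-off is that the legacy class depends on the whole container, which hides its real dependency; that is tolerable for a shim that exists only to support deprecated methods.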

@alanhartless alanhartless force-pushed the alanhartless:feature-dedup-command branch from bdc07ef to 3a7151a Apr 26, 2018
Contributor

Maxell92 left a comment

Works for me

@mautibot mautibot added the Code Review label May 8, 2018
@escopecz escopecz self-assigned this May 8, 2018
Member

escopecz left a comment

The code looks nice. Will test it. Just found minor issues in the doc blocks.

<?php
/*
* @copyright 2017 Mautic Contributors. All rights reserved

@escopecz

escopecz May 8, 2018

Member

should read 2018. Also check other file annotations.

/*
* @copyright 2017 Mautic Contributors. All rights reserved
* @author Mautic, Inc.

@escopecz

escopecz May 8, 2018

Member

Shouldn't be Inc, or should it? Also check other file annotations.

@alanhartless

alanhartless May 9, 2018

Author Contributor

Well technically it is Mautic, Inc contributing this code. So I think this is fine.

@escopecz

escopecz May 9, 2018

Member

Do we want every company/individual who creates/modifies the code to add itself as the author? It says "Mautic" on all files I randomly checked. Maybe @dbhurley could weigh in.

@escopecz

escopecz May 9, 2018

Member

I just saw Mautic, Inc. on another file from 2014. So if that's not new, let's ignore this discussion.

* @param LeadModel $leadModel
* @param MergeRecordRepository $repo
* @param LoggerInterface $logger
*/

@escopecz

escopecz May 8, 2018

Member

Dispatcher param is missing in the docblock

<?php
/*
* @copyright 2017 Mautic Contributors. All rights reserved

@escopecz

escopecz May 9, 2018

Member

Sorry to be PITA, but could you check all the file annotations?

@alanhartless

alanhartless May 9, 2018

Author Contributor

I think I technically wrote these files in 2017 :-)

@mautibot mautibot removed the Code Review label May 9, 2018
Member

escopecz left a comment

Works fine for me. Thanks Alan!

@dbhurley dbhurley added Backlog and removed Ready To Commit labels May 18, 2018
@dbhurley dbhurley removed this from the 2.14.0 milestone May 18, 2018
@alanhartless alanhartless removed the Backlog label May 25, 2018
@alanhartless alanhartless added this to the 2.14.0 milestone May 25, 2018
@alanhartless alanhartless merged commit a3427ee into mautic:staging May 25, 2018
2 checks passed
Scrutinizer: 5 new issues, 57 updated code elements
continuous-integration/travis-ci/pr: The Travis CI build passed