Legacy Data Issue: adding all contributor_alias.canonical_email that are not in contributor_alias.alias_email #269

@cdolfi

Description

collectoss now adds every email seen by the facade task to the contributor_alias.alias_email column. For databases that existed before collectoss, a row needs to be added for every contributor_alias.canonical_email that is not already present in contributor_alias.alias_email.

Potential query:

INSERT INTO augur_data.contributors_aliases (
    alias_email,
    canonical_email,
    cntrb_active,
    cntrb_last_modified,
    tool_source,
    tool_version,
    data_source,
    data_collection_date,
    cntrb_id
)
SELECT
    ca.canonical_email AS alias_email,
    ca.canonical_email AS canonical_email,
    ca.cntrb_active,
    NOW() AS cntrb_last_modified,
    'Manual' AS tool_source,
    '0.0.1' AS tool_version,
    'Manual Entry' AS data_source,
    NOW() AS data_collection_date,
    ca.cntrb_id
FROM augur_data.contributors_aliases ca
WHERE NOT EXISTS (
    SELECT 1
    FROM augur_data.contributors_aliases ca2
    WHERE ca2.alias_email = ca.canonical_email
)
AND ca.alias_email LIKE '%@%';
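
One thing worth checking before running the query above: if several alias rows share the same canonical_email, the SELECT returns that email once per row, so the INSERT may try to add it more than once. The sketch below reproduces the backfill on a throwaway in-memory SQLite database with a simplified, hypothetical three-column version of the table (not the real augur_data schema) and uses GROUP BY to emit one row per missing canonical email. It also filters on canonical_email rather than alias_email, on the assumption that the inserted value is the one that should look like an email. MIN(cntrb_id) is only an illustrative tie-breaker.

```python
import sqlite3


def backfill_canonical_aliases(conn):
    """Insert one alias row per canonical_email that has no alias row yet.

    Returns the number of rows inserted.
    """
    cur = conn.execute(
        """
        INSERT INTO contributors_aliases (alias_email, canonical_email, cntrb_id)
        SELECT ca.canonical_email, ca.canonical_email, MIN(ca.cntrb_id)
        FROM contributors_aliases ca
        WHERE ca.canonical_email LIKE '%@%'
          AND NOT EXISTS (
              SELECT 1
              FROM contributors_aliases ca2
              WHERE ca2.alias_email = ca.canonical_email
          )
        GROUP BY ca.canonical_email  -- one new row per missing email
        """
    )
    conn.commit()
    return cur.rowcount


def make_backfill_demo_db():
    """Build a tiny demo table (hypothetical, simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contributors_aliases ("
        " alias_email TEXT UNIQUE, canonical_email TEXT, cntrb_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO contributors_aliases VALUES (?, ?, ?)",
        [
            # Two aliases share one canonical email that has no alias row.
            ("a1@example.com", "a@example.com", "id-a"),
            ("a2@example.com", "a@example.com", "id-a"),
            # This canonical email already appears as an alias.
            ("b@example.com", "b@example.com", "id-b"),
        ],
    )
    conn.commit()
    return conn
```

Running the backfill a second time should insert nothing, which is a quick way to confirm the NOT EXISTS guard makes the operation safe to repeat.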

Sub/related issue: #268

Note: this is different from #270, which covers contributors.canonical_email; this issue covers contributor_alias.canonical_email.

If an email is added manually, it may also need to be removed from the unresolved table:

DELETE FROM augur_data.unresolved_commit_emails
WHERE email IN (
    SELECT alias_email
    FROM augur_data.contributors_aliases
);
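
Because the DELETE above is destructive, it may be worth seeing how many rows it would remove before committing. The following is a minimal sketch of that idea on an in-memory SQLite database with simplified, hypothetical one-column tables. The helper name and the dry_run flag are illustrative assumptions, not part of collectoss.

```python
import sqlite3


def remove_resolved_emails(conn, dry_run=True):
    """Delete unresolved emails that now have an alias row.

    With dry_run=True the delete is rolled back, so only the count is reported.
    """
    cur = conn.execute(
        "DELETE FROM unresolved_commit_emails "
        "WHERE email IN (SELECT alias_email FROM contributors_aliases)"
    )
    removed = cur.rowcount
    if dry_run:
        conn.rollback()  # undo the delete, keep the data untouched
    else:
        conn.commit()
    return removed


def make_cleanup_demo_db():
    """Build tiny demo tables (hypothetical, simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contributors_aliases (alias_email TEXT)")
    conn.execute("CREATE TABLE unresolved_commit_emails (email TEXT)")
    conn.execute("INSERT INTO contributors_aliases VALUES ('a@example.com')")
    conn.executemany(
        "INSERT INTO unresolved_commit_emails VALUES (?)",
        [("a@example.com",), ("unknown@example.com",)],
    )
    conn.commit()  # commit setup so a later rollback only undoes the delete
    return conn
```

In PostgreSQL the same effect can be had by wrapping the DELETE in BEGIN and ROLLBACK, then switching to COMMIT once the reported row count looks right.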

Metadata

Assignees

No one assigned

Labels

database: Related to the unified data model/schema
