No module named 'splink.duckdb.linker #1920

theimanph · 2024-02-01T20:56:38Z

theimanph
Feb 1, 2024

Hi,

Cool project!! When I try to run the demo code below, I get: No module named 'splink.duckdb.linker'. Any ideas on what is going wrong? Thank you!

tom

from splink.duckdb.linker import DuckDBLinker
import splink.duckdb.comparison_library as cl
import splink.duckdb.comparison_template_library as ctl
from splink.duckdb.blocking_rule_library import block_on
from splink.datasets import splink_datasets
import logging, sys
logging.disable(sys.maxsize)

df = splink_datasets.fake_1000

settings = {
    "link_type": "dedupe_only",
    "blocking_rules_to_generate_predictions": [
        block_on("first_name"),
        block_on("surname"),
        block_on("email"),
    ],
}

linker = DuckDBLinker(df, settings)

linker.cumulative_num_comparisons_from_blocking_rules_chart()

RobinL · 2024-02-01T21:25:34Z

RobinL
Feb 1, 2024
Maintainer

What do you get if you run splink.__version__?

4 replies

theimanph Feb 1, 2024
Author

I am using JupyterLab on Posit. I installed it and got: NameError: name 'splink' is not defined. Interesting.

RobinL Feb 1, 2024
Maintainer

Difficult to tell, but it sounds like for some reason it hasn't been installed into the right place, so Python can't see it.

theimanph Feb 2, 2024
Author

It ended up being due to the version of Python: it was 3.6. I made a new environment using 3.11 and it works perfectly!! Thank you!!
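For anyone hitting the same error, a minimal diagnostic sketch that reports which interpreter the notebook kernel is running and whether that interpreter can see the package. Note that splink.__version__ only works after import splink; a NameError means the name was never imported, which is a different problem from the package being missing. The helper name describe_environment is hypothetical and is not part of Splink.

```python
import importlib
import importlib.util
import sys


def describe_environment(package="splink"):
    """Report the running interpreter and whether it can import `package`.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    # find_spec returns None when the interpreter cannot locate the package.
    spec = importlib.util.find_spec(package)

    installed_version = None
    if spec is not None:
        module = importlib.import_module(package)
        # Not every package defines __version__, so fall back to None.
        installed_version = getattr(module, "__version__", None)

    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "interpreter": sys.executable,
        "importable": spec is not None,
        "installed_version": installed_version,
    }
```

Calling describe_environment() inside the notebook, rather than in a separate terminal, shows the Python version and interpreter path of the kernel that actually runs the code. If the interpreter path differs from the environment where pip install was run, the package was installed somewhere the kernel does not look.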

RobinL Feb 2, 2024
Maintainer

Thanks for letting us know!

theimanph · 2024-02-12T13:55:27Z

theimanph
Feb 12, 2024
Author

Hi Robin,

Silly question for you: have you seen something like the screenshot below before? It doesn't really look like any of the fields is particularly useful. Am I missing something? Is there anything one can do? Thank you!!

Sincerely,
Tom

My code is below the picture.

[screenshot not preserved]

from splink.duckdb.blocking_rule_library import block_on

blocking_rules = [
    block_on("lastname_dm"),
    block_on("PER_LastName"),
    block_on("new_address"),
    #block_on("full_name"),
    block_on("firstname_dm"),
    block_on("PER_FirstName"),
    block_on("PER_DOB"),
    block_on("zip_valid"),
    block_on("PER_HomePhone"),
    block_on("PER_Email"),
    block_on("PER_CellularPhoneOrPager"),
    block_on("middlename_dm")
]

import splink.duckdb.comparison_template_library as ctl
from splink.duckdb.linker import DuckDBLinker
import splink.duckdb.comparison_library as cl

settings = {
    "unique_id_column_name": "id",
    "link_type": "dedupe_only",
    "blocking_rules_to_generate_predictions": blocking_rules,
    "comparisons": [
        ctl.name_comparison("PER_FirstName"),
        ctl.name_comparison("PER_LastName"),
        ctl.name_comparison("full_name"),
        ctl.email_comparison("PER_Email", include_username_fuzzy_level=False),
        ctl.forename_surname_comparison("PER_FirstName", "PER_LastName"),
        cl.levenshtein_at_thresholds("PER_SSN", [2]),
        cl.exact_match("new_address", term_frequency_adjustments=True),
        ctl.date_comparison("PER_DOB"),
        cl.exact_match("city_valid", term_frequency_adjustments=True),
        cl.exact_match("zip_valid", term_frequency_adjustments=True),
    ],
    "retain_intermediate_calculation_columns": True
}

linker = DuckDBLinker(cleaned_df9_subset, settings, set_up_basic_logging=False)

deterministic_rules = [
    "l.PER_SSN = r.PER_SSN and l.PER_DOB = r.PER_DOB",
    "l.PER_FirstName = r.PER_FirstName and l.PER_DOB = r.PER_DOB",
    "l.PER_LastName = r.PER_LastName and l.PER_DOB = r.PER_DOB",
]

linker.estimate_probability_two_random_records_match(deterministic_rules, recall=0.9)
linker.estimate_u_using_random_sampling(max_pairs=1e8)
linker.estimate_m_from_label_column("PER_SSN")

That gives me a full model, and I then use linker.match_weights_chart().
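As background for reading the match weights chart: each bar is the match weight of one comparison level, which is log2(m / u), where m is the probability of seeing that level among true matches and u the probability among non-matches. A level whose m and u are close gets a weight near zero and carries little evidence either way. The sketch below uses made-up m and u values purely for illustration; they are not taken from the model in this post, and match_weight is a hypothetical helper, not a Splink function.

```python
import math


def match_weight(m, u):
    """Match weight of a comparison level: log2 of the Bayes factor m / u."""
    if m <= 0 or u <= 0:
        raise ValueError("m and u must be positive probabilities")
    return math.log2(m / u)


# Hypothetical (m, u) pairs, for illustration only.
example_levels = {
    "exact match on an informative field": (0.9, 0.001),
    "exact match on an uninformative field": (0.5, 0.5),
    "all other comparisons": (0.1, 0.999),
}

for description, (m, u) in example_levels.items():
    print(f"{description}: {match_weight(m, u):+.2f}")
```

If most levels in a chart sit near zero, the estimated m and u values for those levels are nearly equal, so it may be worth inspecting the m and u estimates themselves before concluding that the fields are uninformative.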

0 replies
