Organizations

@open-city @datamade @dssg

Pinned

  1. 🆔 A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

    Python 2.6k 389

  2. 🇺🇸 A Python library for parsing unstructured address strings into address components

    Python 1.1k 210

  3. 🔖 A toolkit for making domain-specific probabilistic parsers

    Python 683 70

  4. 🆔 Command line tool for deduplicating CSV files

    Python 285 68

  5. Estimating Markov random field models with pseudolikelihood

    Python 1 1

  6. 👪 A Python library for parsing unstructured Western names into name components

    Python 394 60

800 contributions in the last year

Contribution activity

June 2020

fgregg has no activity yet for this period.

May 2020

Created a pull request in dedupeio/dedupe that received 1 comment

keep in memmap

This PR adjusts connected components to get the components by means of slices of the memmapped array. This keeps the array mainly o…

+120 −26 1 comment
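
The idea of reading components as slices of a memory-mapped array can be sketched as follows. This is an illustration only, not the code from the PR: the record layout (`edge_dtype`), the helper `component_slices`, and the assumption that rows are already sorted by component label are all hypothetical.

```python
import os
import tempfile

import numpy as np


def component_slices(labels):
    """Yield (start, stop) index pairs for runs of equal values in a sorted 1-D array."""
    if len(labels) == 0:
        return
    # Positions where the label changes mark the boundaries between components.
    boundaries = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    starts = np.concatenate(([0], boundaries))
    stops = np.concatenate((boundaries, [len(labels)]))
    for start, stop in zip(starts, stops):
        yield int(start), int(stop)


# Hypothetical layout: one row per scored pair, already sorted by component label.
edge_dtype = np.dtype([("pair", "i8", 2), ("score", "f4"), ("label", "i4")])

path = os.path.join(tempfile.mkdtemp(), "edges.dat")
edges = np.memmap(path, dtype=edge_dtype, mode="w+", shape=(5,))
edges["pair"] = [(0, 1), (1, 2), (3, 4), (5, 6), (6, 7)]
edges["score"] = [0.9, 0.8, 0.7, 0.95, 0.6]
edges["label"] = [0, 0, 1, 2, 2]
edges.flush()

# The boundary scan reads the label field once; each slice taken afterwards
# is a view onto the file-backed array rather than an in-memory copy.
components = [edges[start:stop] for start, stop in component_slices(edges["label"])]
sizes = [len(c) for c in components]
```

Each element of `components` is a view onto the file-backed array, so a downstream clustering step could process one component at a time without copying the full edge list.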

Created an issue in dedupeio/dedupe that received 10 comments

disk based connected components

The current scaling bottleneck is usually the connected components step in clustering. The current algorithm loads the entire edgelist into memory. W…

10 comments
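
One way to avoid holding the whole edge list in memory is to stream edges from disk and keep only a per-node table. The sketch below uses a union-find structure for that. It is not the approach discussed in the issue, which is truncated here; the binary record format (`EDGE`) and the function names are assumptions made for illustration.

```python
import os
import struct
import tempfile

EDGE = struct.Struct("<qq")  # hypothetical on-disk record: two 64-bit node ids


def find(parent, x):
    """Return the root of x, compressing the path along the way."""
    root = x
    while parent.setdefault(root, root) != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def connected_components(path, chunk_edges=1024):
    """Stream edges from a binary file and group node ids by component."""
    parent = {}
    with open(path, "rb") as f:
        while True:
            # Only one chunk of edges is resident at a time.
            chunk = f.read(EDGE.size * chunk_edges)
            if not chunk:
                break
            for a, b in EDGE.iter_unpack(chunk):
                root_a, root_b = find(parent, a), find(parent, b)
                if root_a != root_b:
                    parent[root_b] = root_a
    groups = {}
    for node in list(parent):
        groups.setdefault(find(parent, node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())


# Write a small example edge list to disk, then read it back two edges at a time.
path = os.path.join(tempfile.mkdtemp(), "edges.bin")
with open(path, "wb") as f:
    for edge in [(0, 1), (1, 2), (3, 4), (5, 6), (6, 7)]:
        f.write(EDGE.pack(*edge))

components = connected_components(path, chunk_edges=2)
```

Memory use here grows with the number of distinct nodes rather than the number of edges, which helps when the edge list is much larger than the node set.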
