random effects model for combining barcodes #56

@nickzoic

We want to be able to use the random effects model to combine barcodes, treating each barcode as if it were a very small replicate in its own right.

The current pivot-and-transform technique isn't suitable for this, but we could register the existing rml_estimate function as a DuckDB UDF and run something like:

select random_effects_model(list(score), list(sigma)) from {view.alias} group by {group_column}

For example:

import duckdb
from countess.plugins.random_effects import rml_estimate

c = duckdb.connect()
# Register the existing Python function as a UDF; element 1 of the result is the score, element 2 the sigma.
c.create_function("rml", rml_estimate, return_type="DOUBLE[]")
c.sql("create table a (a integer, b integer, c float, d float)")
c.sql("insert into a values (1,1,3.0,1.0), (1,2,2.0,1.5), (2,1,0.5,0.25), (2,2,0.75,0.25)")
c.sql("insert into a values (1,3,1.5,0.5)")
c.sql("select * from a").show()
# Collect each group's scores and sigmas into lists, then unpack the UDF result into columns.
c.sql("select a, _R[1] as score, _R[2] as sigma from (select a, rml(list(c), list(d), 50, 1E-7) as _R from a group by a)").show()
┌───────┬───────┬───────┬───────┐
│   a   │   b   │   c   │   d   │
│ int32 │ int32 │ float │ float │
├───────┼───────┼───────┼───────┤
│     1 │     1 │   3.0 │   1.0 │
│     1 │     2 │   2.0 │   1.5 │
│     2 │     1 │   0.5 │  0.25 │
│     2 │     2 │  0.75 │  0.25 │
│     1 │     3 │   1.5 │   0.5 │
└───────┴───────┴───────┴───────┘

┌───────┬────────────────────┬─────────────────────┐
│   a   │       score        │        sigma        │
│ int32 │       double       │       double        │
├───────┼────────────────────┼─────────────────────┤
│     1 │ 1.9031885740021728 │  0.5180484599374635 │
│     2 │ 0.6250000000000001 │ 0.17677669529663692 │
└───────┴────────────────────┴─────────────────────┘
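For anyone wanting to try the group-by approach without CountESS installed, here is a minimal stand-in for the per-group function. It is not rml_estimate: it swaps in the simpler closed-form DerSimonian-Laird random effects estimator, and the name combine_random_effects is hypothetical. It assumes sigma is a standard error (not a variance) and returns [score, sigma] in the same order as the query above unpacks them.

```python
import math
from typing import List, Sequence


def combine_random_effects(scores: Sequence[float], sigmas: Sequence[float]) -> List[float]:
    """Combine per-barcode scores with a DerSimonian-Laird random effects model.

    Hypothetical stand-in for rml_estimate; assumes each sigma is a
    standard error > 0. Returns [combined_score, combined_sigma].
    """
    if len(scores) == 0 or len(scores) != len(sigmas):
        raise ValueError("scores and sigmas must be non-empty and the same length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("every sigma must be positive")

    variances = [s * s for s in sigmas]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effects (inverse-variance weighted) mean, used to measure heterogeneity.
    fixed_mean = sum(w * x for w, x in zip(weights, scores)) / sum_w

    # Cochran's Q and the method-of-moments between-barcode variance tau^2.
    q = sum(w * (x - fixed_mean) ** 2 for w, x in zip(weights, scores))
    c = sum_w - sum(w * w for w in weights) / sum_w
    degrees_of_freedom = len(scores) - 1
    tau2 = max(0.0, (q - degrees_of_freedom) / c) if c > 0 else 0.0

    # Re-weight with tau^2 added to each barcode's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    score = sum(w * x for w, x in zip(re_weights, scores)) / sum_re
    sigma = math.sqrt(1.0 / sum_re)
    return [score, sigma]
```

Because this estimator is not iterative, it takes no iteration-count or tolerance arguments, so a query using it would call it with only the two lists. Its results need not agree with rml_estimate when the barcodes within a group disagree strongly.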
