[Work in Progress DO NOT MERGE] Port math library from clojure to python #1893

jucor · 2025-01-30T17:22:05Z

⚠️ [DO NOT MERGE, FILED HERE FOR TRACKING PROGRESS!] ⚠️

Context

We (polis core team and advisors) have been discussing for the last few years whether Clojure is still the optimal language for the math library, given the evolution of the landscape and of polis needs.
This came up again when @metasoarous raised potential performance issues #1579 (comment) .

Porting any codebase, let alone in this case 7000+ lines of scientific Clojure written by a very smart developer (hats off @metasoarous !), with tons of embedded real-world safeguards and ten years of battle-testing, is a Very Big Endeavour, super risky. Most of all, we need to keep all the domain knowledge that is embedded in the current codebase. This is not a rewrite from scratch, but a port!

So, crazy, but there's a lot to gain (massive ML ecosystem: people, libraries, etc), so we would be remiss not to at least explore how far we can go. Worst that can happen is that this completely fails, we lose time and I've got egg on my face. I'll mitigate the former by still working on the new LLM features with @colinmegill, and for the latter, well, I can live with that :)

So let's go!

Plan

I'll be focusing first on the core functionality: the math. Once that is clear and done, then I will work on the poller and runner. I'll be using numpy for all vector operations.

Preparation

0/ besides the existing unit tests, have an integration test that runs an update through a full conversation.
1a/ hack a basic JSON serialize/deserialize clojure function for vectors/matrices, and a similar one for python
1b/ write a basic clojure function that serializes its arguments, calls a python command, and deserializes the return value
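For the python side of step 1a, a minimal sketch of a JSON round-trip for vectors and matrices could look like this. The `{"shape", "data"}` layout and the function names `to_json` / `from_json` are assumptions made for illustration, not the format used in this branch.

```python
import json

import numpy as np


def to_json(value):
    """Serialize a vector or matrix (numpy array or nested list) to a JSON string."""
    arr = np.asarray(value, dtype=float)
    # Hypothetical layout: the shape is stored explicitly so that 1-D and
    # empty inputs round-trip without ambiguity.
    return json.dumps({"shape": list(arr.shape), "data": arr.ravel().tolist()})


def from_json(text):
    """Rebuild the numpy array produced by to_json."""
    payload = json.loads(text)
    return np.array(payload["data"], dtype=float).reshape(payload["shape"])
```

The Clojure side would need to emit and read the same layout for the two halves to talk to each other.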

Core iteration

then, iterating this way:

2/ identify one core function in the clojure game (pre-filtering, core PCA, etc), and its unit tests if any
3/ implement that exact function in python with numpy, with unit tests, possibly adding some
4/ replace the clojure function by a serialization + call to the replacing python function + deserialization
5/ check that the end result of the full pipeline with and without python is still the same, on one or more real conversations from the database (in addition to the unit tests ofc).
6/ Go to 2 with another core function, then climb up the call tree.

Expectations

It'll be a real slog at first for steps 0 and 1, getting familiar with running the various functions one by one in clojure when needed (i.e. without the realtime poller etc, which I'll keep for the end), but the pace should then increase.

Performance should also mechanically improve, as per #1579 and #1062 and #1580 .

Math part will be fun -- although we might start to see some small numerical differences appearing as we go, hopefully keeping them small.
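To keep an eye on those small numerical differences, outputs can be compared with a tolerance rather than exact equality. The sketch below is an assumption about how that check might look: the tolerance values are placeholders, and the sign-insensitive variant is included because principal components are only defined up to a sign flip, so two correct implementations can legitimately disagree on sign.

```python
import numpy as np


def outputs_match(expected, actual, rtol=1e-7, atol=1e-9):
    """True when both arrays have the same shape and agree within tolerance."""
    a = np.asarray(expected, dtype=float)
    b = np.asarray(actual, dtype=float)
    # equal_nan=True treats missing values (NaN) in the same position as equal.
    return a.shape == b.shape and bool(
        np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=True)
    )


def components_match(expected, actual, rtol=1e-7, atol=1e-9):
    """Compare component matrices row by row, accepting a sign flip per row."""
    a = np.atleast_2d(np.asarray(expected, dtype=float))
    b = np.atleast_2d(np.asarray(actual, dtype=float))
    if a.shape != b.shape:
        return False
    return all(
        outputs_match(ra, rb, rtol, atol) or outputs_match(ra, -rb, rtol, atol)
        for ra, rb in zip(a, b)
    )
```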

Now to figure out what is stored where :)

ballPointPenguin · 2025-01-31T02:57:18Z

This would be amazing!

We serialize to JSON and, optionally, to temp file. We probably will need to expand to take kw-args into account.

And some unit tests for fun, and even coverage!

The makefile will avoid having to remember the test syntax, making them frictionless to run. The partial runs are a way to keep focusing on just the output of whichever part we want -- although it doesn't save the biggest drag: clojure launch time.

jucor · 2025-01-31T11:01:10Z

We now have serialization of arguments in Clojure, with tests (in Docker). And Serialization+Deserialization of arguments in Python, with tests (locally, not in Docker).
And a full running makefile calling those tests.

Next: wrapping a clojure function with a serializer of input and output, and a python script to replay that call but to the python function, and check the output matches.
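A rough sketch of what the python replay script could look like. The record layout (`{"args": [...], "result": ...}`) and the name `replay` are hypothetical, chosen only to illustrate the idea of re-running a recorded Clojure call against the python function.

```python
import json

import numpy as np


def replay(record_text, py_func, rtol=1e-7, atol=1e-9):
    """Re-run a recorded call on py_func and compare with the recorded result.

    record_text is assumed to be JSON of the form
    {"args": [<array>, ...], "result": <array>}, written by the Clojure wrapper.
    """
    record = json.loads(record_text)
    args = [np.asarray(a, dtype=float) for a in record["args"]]
    expected = np.asarray(record["result"], dtype=float)
    actual = np.asarray(py_func(*args), dtype=float)
    ok = actual.shape == expected.shape and bool(
        np.allclose(actual, expected, rtol=rtol, atol=atol)
    )
    return ok, actual, expected
```

Returning both arrays alongside the verdict makes it easy to print what differs when the check fails.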

jucor · 2025-01-31T11:43:18Z

I am adding notes to README.pythonport.md, but also pasting them in this thread for visibility.

PCA

Let's start with the PCA, as it's well known and nicely isolated.

Let's first:

Task 1: document the calling graph of PCA to get a lay of the land
Task 2: add more clojure tests to the PCA functions
Task 3: code the Python PCA and the tests
Task 4: wrap the clojure PCA call to store its input and output, and run it on a basic conversation.
Task 5: load the input from clojure and run the PCA on that.
Task 6: load a big conversation into the database
Task 7: record PCA input/output on that, and compare.

Checking its clojure calling graph in pca.clj:

graph TD
    wrapped_pca[wrapped-pca] --> powerit_pca[powerit-pca]
    powerit_pca --> power_iteration[power-iteration]
    powerit_pca --> rand_starting_vec[rand-starting-vec]
    powerit_pca --> factor_matrix[factor-matrix]
    
    power_iteration --> xtxr[xtxr]
    power_iteration --> repeatv[repeatv]
    
    factor_matrix --> proj_vec[proj-vec]
    
    xtxr --> repeatv
    
    pca_project[pca-project]
    
    sparsity_aware_project_ptpts[sparsity-aware-project-ptpts] --> sparsity_aware_project_ptpt[sparsity-aware-project-ptpt]
    
    pca_project_cmnts[pca-project-cmnts] --> sparsity_aware_project_ptpts
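For orientation, here is a textbook power-iteration PCA in numpy. The function names loosely echo the graph above, but the body is a generic sketch with deflation, written under the assumption of dense data without missing votes. It is not a transcription of pca.clj, and the iteration count and seeding are arbitrary choices.

```python
import numpy as np


def power_iteration(x, n_iters=100, start=None, seed=0):
    """Approximate the leading principal direction of centered data x."""
    rng = np.random.default_rng(seed)
    v = (
        rng.standard_normal(x.shape[1])
        if start is None
        else np.asarray(start, dtype=float)
    )
    v = v / np.linalg.norm(v)
    for _ in range(n_iters):
        # Multiply by X^T X without ever forming that matrix.
        w = x.T @ (x @ v)
        norm = np.linalg.norm(w)
        if norm == 0.0:  # no variance left in the data
            break
        v = w / norm
    return v


def powerit_pca(data, n_comps=2, n_iters=100):
    """Return (center, components) with one component per row."""
    data = np.asarray(data, dtype=float)
    center = data.mean(axis=0)
    x = data - center
    comps = []
    for k in range(n_comps):
        v = power_iteration(x, n_iters=n_iters, seed=k)
        comps.append(v)
        # Deflation: remove the direction just found before the next pass.
        x = x - np.outer(x @ v, v)
    return center, np.array(comps)
```

The `start` argument is there because a power iteration can be warm-started from a previous solution, which is one way an iterative method lends itself to incremental updates.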

- Add cloverage
- Explicitly deactivate conv-man tests, which require a full database, as warned in the comments.
- Fix a bug in the new test runner when parsing command line arguments.

jucor · 2025-01-31T12:27:32Z

We want to make sure our matching python and clojure tests cover as much code as possible, so we need to measure it. Hence I have added coverage measurement to the test runner.

Now focusing on PCA coverage, we see some codepaths not tested, so let’s add tests for that.

I am trying to get full _branch_ coverage of the function `power-iteration`, but failing to. There must be some corner cases I am not surfacing (and neither is Cursor). I will leave it aside for now, make a note that it needs fixing, and move on to covering the entirely missing lines in the wrapped-pca function.

jucor · 2025-01-31T13:28:46Z

In spite of lots of PCA tests, I do not get full branch coverage in the key function power-iteration. To avoid going crazy, I will move to the simpler line-coverage of other functions, and will come back to branch coverage.

I do wonder whether we really need full branch coverage, knowing that the PCA method, while currently an elegant hand-coded power-iteration, might eventually be handed over to scikit-learn or LAPACK. But before that, we will need to check whether the iterative nature of the power-iteration is exploited for incremental updates of the conversation, as I suspect it is. So for now, we will stick to power-iteration.

See compdemocracy#1894 and `README.pythonport.md` in this commit for explanations of why the code is unreachable.

jucor · 2025-01-31T16:05:23Z

pca-project does not seem to be called anywhere. conversation.clj calls sparsity-aware-project-ptpt[s] instead, which makes sense. Therefore, deleting this function to remove dead code.

jucor · 2025-02-12T17:48:51Z

Just want to give kudos to @metasoarous for his 2014 notes on his hunt for performance back then, living next to the code in https://github.com/compdemocracy/polis/blame/52652e443c52bb554536ef029ce4c951826039c3/math/perf . It's super helpful to see what scenarios you envisaged, what things you were paying attention to, etc! Very useful context, thanks Chris !

metasoarous · 2025-02-12T19:33:45Z

Regarding the purported bug with "including participants that should not be included", this may have actually been intended. If someone joins the conversation before there are 7 comments, we include them in the results as soon as they vote on all available comments. And once someone is in the conversation, I think it would be a bad idea to remove them. Now, we can certainly debate the extent to which this behavior is worthwhile; the rationale at the time was that this would help smooth out the experience on new conversations. This is a pretty marginal benefit for most real world usage these days though, and could be looked at as an artifact of early-days startup priorities. I don't know that the new implementation necessarily needs to cargo-cult this decision if it's outgrown its utility, but did want to flag the original rationale. If there are other issues though, do please raise a dedicated issue with additional context!

One final thought: It sounds like this may not be relevant if the goal of the bindings is just to test/compare behavior until the swap is made. But in case you find it useful, there's a fantastic project called libpython-clj which offers zero-copy bindings between Clojure and Python data structures.
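One possible reading of the inclusion rule described in this comment, as a small sketch. The threshold of 7 is taken from the comment itself; treating it as "votes needed = min(7, number of comments)" and the name `update_in_conv` are assumptions, and the Clojure code remains the reference for the actual behavior.

```python
def update_in_conv(in_conv, vote_counts, n_comments, threshold=7):
    """Return the updated set of participants included in the results.

    in_conv:     participant ids already included (membership is sticky)
    vote_counts: dict mapping participant id -> number of votes cast so far
    n_comments:  number of comments currently available
    """
    # Assumed rule: early in a conversation, voting on every available
    # comment is enough; later, `threshold` votes are required.
    needed = min(threshold, n_comments)
    newly_in = {
        pid for pid, n in vote_counts.items() if n_comments > 0 and n >= needed
    }
    # Union, never intersection: nobody is removed once included.
    return set(in_conv) | newly_in
```

With this reading, a participant who voted on all 3 comments of a young conversation stays included even after the conversation grows to 10 comments and they have still only cast 3 votes.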

jucor · 2025-02-12T19:36:04Z

Super helpful context, thanks @metasoarous !

Note: the timing comparisons at the moment are very unfair to clojure, because in my dev setup clojure runs in docker whereas (for now) python runs directly on the host. Thus python has access to more cores if numpy is properly multicore-optimized, and its BLAS calls are also compiled for the actual hardware (Apple Silicon), which I believe is probably not the case in a linux docker image running on my mac. So a fair comparison will only be possible once we have everything running in Docker, ideally on a machine whose silicon matches the docker image.

- Improve result comparison with more detailed type checking
- Add timing information for Python vs Clojure function execution
- Introduce performance summary with speedup calculations
- Add more informative discrepancy reporting
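The timing and speedup summary could be computed along these lines. This is a generic sketch, not the code in this branch: `time_call` and `speedup` are illustrative names, and taking the best of several runs is one common convention among others.

```python
import time


def time_call(func, *args, repeats=5):
    """Run func(*args) several times; return (last result, best wall time in s)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


def speedup(clojure_seconds, python_seconds):
    """How many times faster the python call was (values above 1 favor python)."""
    return clojure_seconds / python_seconds
```

The caveat from the note still applies: the ratio only means something when both sides run in comparable environments.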

jucor · 2025-02-25T12:48:37Z

After discussing with @colinmegill yesterday

Updating our plan:

Instead of going purely from the leaves of the call-tree and up the call stack, which only allows one big shipping at the end, I will:

Write a python worker that loads from the DB the math info computed by clojure.
In parallel:
- Add new math features straight to python as needed.
- Progressively redo/replace the math computations, gradually eating into the clojure code.

This:

- allows faster feedback for iterations
- allows more colleagues to jump in the python code sooner (@DZNarayanan :) )
- ships features as soon as they are ready, rather than waiting for an integral port

TEST DOES NOT RUN YET, Type error. That's normal, still working on this.

1. Porting repness to python by loading group info from JSON blob computed by clojure
2. Testing that the python representative comments are the same as the clojure ones.

The computation in python runs, the test compares it correctly to the values computed by clojure and stored in the DB. However, the results are not yet equal. Still working on this.

Previous implementations were completely missing the moderation. Now added. Let's see if it runs...

Remove some dead code trailing around

Update: after adding filtering of the matrix based on moderation, I'm getting closer. On BG2018, out of 5 representative comments selected by Clojure for each group, the python gets 4 right and misses one. I'll drill down on why that is. Probably still getting one count wrong in the p-test, i.e. in how I account for modded-out comments.
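To drill down on which representative comments differ, a per-group overlap report is a simple starting point. The input shape (a dict from group id to a list of comment ids) and the name `repness_overlap` are assumptions for the sake of the example, not the structure stored in the DB.

```python
def repness_overlap(clj_repness, py_repness):
    """Compare representative comment ids per group.

    Both arguments are assumed to map group id -> list of comment ids.
    """
    report = {}
    for gid, clj_ids in clj_repness.items():
        clj_set = set(clj_ids)
        py_set = set(py_repness.get(gid, []))
        report[gid] = {
            "matched": len(clj_set & py_set),
            "missing": sorted(clj_set - py_set),  # picked by clojure only
            "extra": sorted(py_set - clj_set),    # picked by python only
        }
    return report
```

A result like `{"matched": 4, "missing": [12], "extra": [40]}` points directly at the two comments whose statistics need a closer look.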

jucor · 2025-03-07T21:42:19Z

Update: due to time-sensitivity of exporting repness via a python microservice, I've opened #1954 . Not a port of the clojure code: uses the analysis notebooks as asked by @colinmegill .
Referring it here as it's still part of the broader move of math towards python.

NewJerseyStyle · 2025-03-15T04:34:53Z

Sorry to sidetrack here, I didn't find another issue related to the discussion of the RL topic.

I found a paper that describes some methods to measure the creativity of participants on a given task with/without an AI assistant, using P-creativity (psychological creativity) and H-creativity (historical creativity):
https://arxiv.org/abs/2410.03703

Thought that could be a reward for the RL system.

Heyyy, just saw this RL paper https://www.nature.com/articles/s41562-023-01686-7
Guess that's something you were interested in. I thought about that, but I prefer to ensure the information propagates through as many people as possible for mutual understanding, so I prefer a ranking system or something similar for message passing. But if we consider the output of RL as a ranking (high scores sent first, low scores coming back later), it works too.

jucor · 2025-03-26T14:00:23Z

Tidying up by closing this pull request, as this approach does not provide the rapid shipping and iteration that CompDem prefers. I have, however, transferred the ownership of the repo with this branch to @colinmegill , in case it can ever be helpful in future.

Kick off with a README

74dd0c8

jucor changed the title ~~[Work in Progress DO NOT MERGE] Update math library from clojure to python~~ [Work in Progress DO NOT MERGE] Port math library from clojure to python Jan 30, 2025

jucor added 4 commits January 30, 2025 17:26

Write down the plan in README

e0367ba

Figure out how to run tests and single-convo update

6c57efc

Now to figure out what is stored where :)

Fix typo

bb6f82a

Fix typo

822ad9b

jucor added 5 commits January 31, 2025 09:01

Add serialization to JSON

8bafa94

We serialize to JSON and, optionally, to temp file. We probably will need to expand to take kw-args into account.

Add python serialization/deserialization

ef14068

And some unit tests for fun, and even coverage!

Document the makefile

548c0f8

Check python coverage at test by default

3a65ef4

Detail the plan a bit more.

2d04620

Report coverage in clojure tools

28ceec1

- Add cloverage
- Explicitly deactivate conv-man tests, which require a full database, as warned in the comments.
- Fix a bug in the new test runner when parsing command line arguments.

Update notes

112154d

jucor mentioned this pull request Jan 31, 2025

wrapped-pca does not treat single-row and single-column matrices as intended #1894

Closed

jucor added 2 commits January 31, 2025 15:21

Add tests for wrapped-pca and remove unreachable code

afcbf19

See compdemocracy#1894 and `README.pythonport.md` in this commit for explanations of why the code is unreachable.

Remove unused function

7850848

jucor added 6 commits February 3, 2025 10:30

Add easier single-convo runner alias

f9386b1

Add SQL command to makefile

ccb1905

Document database import

c6b3e31

Actually document the new make entries

ce36a81

Fix makefile phony targets

83162d2

Fix typos in Makefile

72a38ec

Add some indication of what differs

3412337

jucor marked this pull request as draft February 15, 2025 23:02

jucor force-pushed the math-python branch 4 times, most recently from 3702944 to 3f8f82a Compare February 20, 2025 14:51

jucor closed this Feb 20, 2025

jucor deleted the math-python branch February 20, 2025 15:32

jucor restored the math-python branch February 20, 2025 15:33

jucor reopened this Feb 20, 2025

jucor force-pushed the math-python branch from 3f8f82a to 47ab2ea Compare February 20, 2025 15:41

jucor added 9 commits February 25, 2025 13:08

Prepare a table to store the Python math results

56cd0b7

First stab at porting repness

3a9bd4c

TEST DOES NOT RUN YET, Type error. That's normal, still working on this.

1. Porting repness to python by loading group info from JSON blob computed by clojure
2. Testing that the python representative comments are the same as the clojure ones.

Get repness comparison to run - but not equal

fbd6710

The computation in python runs, the test compares it correctly to the values computed by clojure and stored in the DB. However, the results are not yet equal. Still working on this.

Handle moderation

2466074

Previous implementations were completely missing the moderation. Now added. Let's see if it runs...

Get test running -- not passing

79f683e

Remove some dead code trailing around

Add debug info

f22dd2a

Print more result table

c8b7458

Add debug info + Adapt call

3008bfe

jucor mentioned this pull request Mar 10, 2025

Import Repness code from notebooks #1954

Closed

jucor closed this Mar 26, 2025


4 participants
