Description
To reproduce, run the following on a Windows machine:
from policyengine_us import Microsimulation
baseline = Microsimulation(dataset="pooled_3_year_cps_2023")
baseline_income = baseline.calculate("household_net_income", period=2025)
This raises ValueError: expected non-negative integer. The traceback points to commons\formulas.py near line 338 (I added some debugging lines, so the line numbers may be off by a few):
337 [
--> 338 np.random.default_rng(
339 seed=id * 100 + population.simulation.count_random_calls
340 ).random()
341 for id in entity_ids
342 ]
In the debugger, only on Windows, I was able to see the problem:
(Pdb) entity_ids.max() * 100 + population.simulation.count_random_calls
-1618636895
Channeling Gemini: On your Windows setup, NumPy appears to be performing the multiplication 26763304 * 100 and trying to store the intermediate result (2676330400) back into a 32-bit integer context. Since it doesn't fit, it overflows and wraps around, producing a negative int32 value (specifically -1618636896 for the multiplication part). On your Linux setup, NumPy seems to be handling this differently....
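For context on the platform difference: NumPy releases before 2.0 use a 32-bit default integer on Windows (it follows the C long type) and a 64-bit one on Linux, which would explain why the same arithmetic only wraps on Windows. The arithmetic itself can be checked without NumPy. The sketch below is pure Python; wrap_to_int32 is a hypothetical helper written for this illustration, and count_random_calls = 1 is an assumption inferred from the difference between the pdb output and the multiplication result quoted above.

```python
def wrap_to_int32(value: int) -> int:
    """Reinterpret an arbitrary Python int as a signed 32-bit integer
    (two's complement), mimicking what a 32-bit integer dtype stores."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x100000000 if value >= 0x80000000 else value


# Largest entity id from the example above.
max_entity_id = 26763304
# Assumption: the counter was 1 when the debugger output was captured.
count_random_calls = 1

# Python ints never overflow, so this is the seed the formula intends.
exact_seed = max_entity_id * 100 + count_random_calls

# What a 32-bit integer context ends up holding instead.
wrapped_seed = wrap_to_int32(exact_seed)

print(exact_seed)    # 2676330401
print(wrapped_seed)  # -1618636895, the value seen in pdb
```

The intended seed exceeds the int32 maximum of 2147483647, so any 32-bit context produces the negative value that default_rng then rejects.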
Potential easy fix: I noticed that simply not multiplying by 100 eliminated the negative number problem. entity_ids already has a lot of dispersion, and I'd question whether it is necessary to even add anything to it.
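If the multiplication by 100 needs to stay, an alternative would be to widen the ids to 64 bits before doing the arithmetic. This is only a sketch under assumptions: per_entity_seeds is a hypothetical helper, not a function in policyengine_us, and the int32 input array is forced here to mimic the Windows behaviour on any platform.

```python
import numpy as np


def per_entity_seeds(entity_ids, count_random_calls):
    """Hypothetical helper: build one seed per entity using the same
    formula as the traceback, but computed in 64-bit integers."""
    ids_64 = np.asarray(entity_ids).astype(np.int64)  # widen before multiplying
    return ids_64 * 100 + count_random_calls


# Force int32 input to reproduce the Windows situation regardless of platform.
entity_ids = np.array([12, 26763304], dtype=np.int32)
seeds = per_entity_seeds(entity_ids, count_random_calls=1)

# Same per-entity draw as in the traceback; int() passes a plain Python int.
draws = np.array(
    [np.random.default_rng(seed=int(s)).random() for s in seeds]
)

print(seeds)  # all values are non-negative
print(draws)  # one float in [0, 1) per entity
```

Casting first keeps the seeds identical to what Linux already computes, so results would stay reproducible across platforms, whereas dropping the factor of 100 changes the seeds everywhere.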