compute_accessibility chunk size #396

Closed
esanchez01 opened this issue Mar 22, 2021 · 2 comments

@esanchez01
Contributor

I have been working on optimizing the performance of SANDAG's ActivitySim 3-zone setup (using the latest develop branch). In terms of run time, the main bottleneck has proven to be the compute_accessibility step, apparently because the step does not fit in RAM on our machine (320 GB). I therefore tested smaller chunk sizes to alleviate this and have seen reductions in run time. However, I have not been able to fit the step fully in RAM, because the model run fails when the chunk size is set too low. In our case, a chunk size smaller than 6 billion results in failure (the run completed at 6 billion but failed at 5.5 billion).

Looking at the stack trace (included below), it appears that the transit_df for one of the processes is empty and therefore can't be vectorized. A second error is reported for block_offsets, but this seems to be a bug in the code: block_offsets is not guaranteed to be assigned when the except clause runs.
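
For reference, the size-0 failure can be reproduced outside ActivitySim. The following is only a minimal sketch, not the library's code: `lookup_offsets` is a hypothetical helper, and the mapping is copied from the `skim_keys_to_indexes` line in the log. It assumes that setting `otypes` is an acceptable way to handle an empty `dim3`; whether an empty transit_df should reach this point at all is a separate question.

```python
import numpy as np

# Stand-in for the period-to-index mapping printed in the log output.
skim_keys_to_indexes = {"AM": 603, "MD": 604, "PM": 605}


def lookup_offsets(dim3):
    """Hypothetical helper: map time-period keys to skim block offsets."""
    dim3 = np.asarray(dim3)
    # With otypes set, np.vectorize does not have to call the function on
    # the first element to infer the output dtype, so a size-0 input simply
    # yields an empty int64 array instead of raising ValueError.
    return np.vectorize(skim_keys_to_indexes.get, otypes=[np.int64])(dim3)


empty = np.array([], dtype=object)

# Without otypes, NumPy cannot infer the output type from an empty input.
try:
    np.vectorize(skim_keys_to_indexes.get)(empty)
except ValueError as err:
    print(f"without otypes: {err}")

print(lookup_offsets(empty).shape)
print(lookup_offsets(["AM", "PM"]))
```

Note that a key missing from the mapping would make `.get` return `None`, which cannot be converted to `int64`, so this sketch only covers the empty-input case.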

It's my understanding that the chunk size is dynamic but still requires an estimate. I'm not certain whether this failure is due to the chunk size estimate being too small (below 6 billion) or to some other code-related reason.

Traceback

21/03/2021 03:47:52 - DEBUG - activitysim.core.chunk - log_df transit_df elements: 0 bytes: 0.0 shape: (0, 4) : accessibility.tvpb_best_time.AM.build_virtual_path.compute_tap_tap_time
21/03/2021 03:47:52 - INFO - activitysim.core.mem - trace_memory_info accessibility.tvpb_best_time.AM.build_virtual_path.compute_tap_tap_time.add.transit_df rss: 0.93GB used: 93.46 GB percent: 29.2%
21/03/2021 03:47:52 - DEBUG - activitysim.core.pathbuilder_cache - MEM #TVPB CACHE compute_tap_tap_utilities all_transit_paths net 0 B (0) total 956 MB in 2.25 s
21/03/2021 03:47:52 - INFO - activitysim.core.pathbuilder - #TVPB CACHE deduped transit_df from 0 to 0
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - SkimDict lookup_3d error: ValueError: cannot call `vectorize` on size 0 inputs unless `otypes` is set
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - key TRN_IVT_FAST
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - orig max nan min nan
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - dest max nan min nan
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - skim_keys_to_indexes: {'AM': 603, 'MD': 604, 'PM': 605}
21/03/2021 03:47:52 - ERROR - activitysim.core.skim_dictionary - dim3 []
21/03/2021 03:47:52 - ERROR - activitysim.core.assign - assign_variables - UnboundLocalError (local variable 'block_offsets' referenced before assignment) evaluating: los.get_tappairs3d(df.btap, df.atap, df.tod, 'TRN_IVT_FAST')
Traceback (most recent call last):
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\activitysim\core\skim_dictionary.py", line 329, in lookup_3d
    block_offsets = np.vectorize(skim_keys_to_indexes.get)(dim3) # this should be faster than map
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2108, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2186, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2142, in _get_ufunc_and_otypes
    raise ValueError('cannot call `vectorize` on size 0 inputs '
ValueError: cannot call `vectorize` on size 0 inputs unless `otypes` is set

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\activitysim\core\assign.py", line 287, in assign_variables
    expr_values = to_series(eval(expression, globals_dict, _locals_dict))
  File "<string>", line 1, in <module>
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\activitysim\core\los.py", line 568, in get_tappairs3d
    s = self.get_skim_dict('tap').lookup_3d(otap, dtap, dim3, key)
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\activitysim\core\skim_dictionary.py", line 338, in lookup_3d
    logger.error(f"dim3 block_offsets {np.unique(block_offsets)}")
UnboundLocalError: local variable 'block_offsets' referenced before assignment
21/03/2021 03:47:52 - DEBUG - activitysim.core.pathbuilder_cache - MEM #TVPB build_virtual_path compute_tap_tap net 624 KB (638976) total 956 MB in 2.62 s
21/03/2021 03:47:52 - ERROR - activitysim.core.assign - assign_variables - UnboundLocalError (local variable 'block_offsets' referenced before assignment) evaluating: tvpb.get_tvpb_best_transit_time(orig=df.orig, dest=df.dest, tod='AM')
Traceback (most recent call last):
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\activitysim\core\skim_dictionary.py", line 329, in lookup_3d
    block_offsets = np.vectorize(skim_keys_to_indexes.get)(dim3) # this should be faster than map
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2108, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2186, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "c:\users\esanc\.conda\envs\asimtest_multi_develop\lib\site-packages\numpy\lib\function_base.py", line 2142, in _get_ufunc_and_otypes
    raise ValueError('cannot call `vectorize` on size 0 inputs '
ValueError: cannot call `vectorize` on size 0 inputs unless `otypes` is set
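
The secondary UnboundLocalError in the traces above follows a common pattern: the except clause logs a variable that is only bound if the line inside the try succeeds. The sketch below illustrates that pattern and one possible guard. Both functions are hypothetical and are not the actual `lookup_3d` implementation; only the `np.vectorize` call and the log message mirror what the traceback shows.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def lookup_offsets_buggy(mapping, dim3):
    """Mirrors the failing pattern: the handler reads an unbound local."""
    try:
        block_offsets = np.vectorize(mapping.get)(dim3)
    except Exception:
        # If the assignment above raised, block_offsets was never bound, so
        # this line raises UnboundLocalError and hides the original error.
        logger.error(f"dim3 block_offsets {np.unique(block_offsets)}")
        raise
    return block_offsets


def lookup_offsets_safe(mapping, dim3):
    """One possible guard: bind the name before entering the try block."""
    block_offsets = None
    try:
        block_offsets = np.vectorize(mapping.get)(dim3)
    except Exception as err:
        logger.error("lookup error: %s: %s", type(err).__name__, err)
        if block_offsets is not None:
            logger.error("dim3 block_offsets %s", np.unique(block_offsets))
        raise  # the original ValueError propagates unchanged
    return block_offsets


def raised_type(func, *args):
    """Return the exception class raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as err:
        return type(err)
    return None


mapping = {"AM": 603, "MD": 604, "PM": 605}
empty = np.array([], dtype=object)

print(raised_type(lookup_offsets_buggy, mapping, empty).__name__)
print(raised_type(lookup_offsets_safe, mapping, empty).__name__)
```

With the guard in place, the error that surfaces is the original ValueError about size-0 inputs, which points more directly at the empty transit_df.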

@bstabler
Contributor

#397

@bstabler
Contributor

This crash should not happen, so we'll need to investigate. In the meantime, you can set different chunk sizes for different submodels, as shown here:

[image]

@bstabler bstabler mentioned this issue Jun 9, 2021