add OpenNARS-style Distributor in Bag #84

maxeeem · 2023-12-18T18:01:59Z

I added the distributor but looking for your suggestions to see if we can improve the implementation. The issue is that initializing the Distributor object is not exactly quick, there are multiple loops in the __init__ method. To avoid recalculation I added a cache in the form of a static method new but as I am myself new to python, perhaps you guys can think of better ways to optimize the calculations. With this change tests run slower but still less than 1 min.

ccrock4t · 2023-12-19T00:20:07Z

Hi Maxim, I will need more info about what is the Distributor

maxeeem · 2023-12-19T17:45:55Z

@ccrock4t you can see the code from OpenNARS.

As the comments suggest (I will add them to this PR shortly)

A pseudo-random number generator, used in Bag.

And

For any number N < range, there is N+1 copies of it in the array, distributed as evenly as possible

This is all following our discussion over email, regarding selection probability in bag being propositional to priority.

Without this change, the result of test_bag_take_task() looks like this (as @bowen-xu pointed out)

After the change it looks like this which I believe is more along the lines of what we want

ccrock4t · 2023-12-19T20:33:55Z

In terms of speed, it seems like that is the cost of using this algorithm. If we can find out the name of the algorithm, maybe there are discussions online about improving it?

pynars/NARS/DataStructures/_py/Bag.py

ccrock4t · 2023-12-19T20:31:41Z

pynars/NARS/DataStructures/_py/Bag.py

-
-        if self._is_current_level_empty():
-            self._move_to_next_nonempty_level()
+        if self._is_current_level_empty() or self.currenCounter == 0:


Can you help me understand why we need currentCounter? It seems to hold the most recent index of an Item taken from a Bucket. What is special about index 0 to make Bag change its current Level?

I suppose we need to change the level at some point, but why not at every Take()?

Can you help me understand why we need currentCounter? It seems to hold the most recent index of an Item taken from a Bucket. What is special about index 0 to make Bag change its current Level?

So this is a fun one :) I basically just converted Java to Python and adapted it, so there may be better ways to write all this stuff.

This is related to the THRESHOLD parameter I was asking about over email. It seems similar to the PyNARS take_in_order as far as I understand. In OpenNARS if currentLevel is below THRESHOLD, then currentCounter will be set to 0 and at next call to take we will go inside the conditional, otherwise we will continue taking from the current level until empty.

I suppose we need to change the level at some point, but why not at every Take()?

The way it's currently implemented it is indeed going to change levels at every take since take_in_order is set to True so the implementation is not complete in that regard. In OpenNARS I also wasn't able to find any instances where it would fire off the level completely as THRESHOLD is always set to 1. So I'm not entirely sure what the use case is for this property or how to fully reconcile the two implementations since Bag in PyNARS has quite a few other methods and the code is different compared to OpenNARS. If you guys want, we can have a design discussion and fully document the Bag implementation and figure out all of these edge cases. I would need your help though.

As it is yeah, if take_in_order is set to False then the logic is kinda weird. There isn't a direct match for it in the Java code so looking to you guys for suggestions on how to handle this.

I see, so the graph is showing the behavior when the level changes every take(), which is good because it matches our expectation. I am not sure about take_in_order itself, but in my opinion for take() we should choose a random level every time, just remove the self.currenCounter unless @bowen-xu sees it should be used for take_in_order.

I agree it is a good idea to document Bag more thoroughly. We could put more info in the Conceptual Design document we have going

take_in_order does not appear in the conceptual design. It can be removed from the code.
I didn't really read the code of Bag in OpenNARS 3.0.4; but I only referred to NARS-Python before. I just found that the previous implementation of Bag in PyNARS is not consistant with OpenNARS.

I suggest that we shall faithfully adopt the orignial design of OpenNARS in terms of Bag. So thanks @maxeeem for helping us consider this issue carefully!

@maxeeem Why not just re-implement the Bag data-structure in python, so that it could be fully the same as the java version?

I can look into aligning the implementations and porting it from Java to Python. I did notice that Buffer uses take_in_order=False so we'll need to examine that as well and I imagine Channels which inherits from Buffer.

ccrock4t · 2023-12-21T02:16:29Z

It seems a Distributor with value $N$ takes $O(N^2)$ time for initialization, so that is why the slowdown is occurring.

My initial thoughts: the self.order array is being filled one value at a time (in a loop) using local variables rank, time, and index. Can we simply save the state of those variables as members of the object, then calculate the value for self.order on the fly, whenever pick() is called, using the same logic in __init__?

EDIT: it will probably not be that simple

bowen-xu · 2023-12-20T14:40:24Z

pynars/NARS/DataStructures/_py/Bag.py

        self.levels = tuple(list() for i in range(self.n_levels))  # initialize buckets between 0 and capacity
+
+        self.currentCounter = 0


the "lower camel case" style is better

Thanks for pointing this out. Old habits are hard to break :) I'll push up a fix shortly.

bowen-xu · 2023-12-21T03:56:43Z

It seems a Distributor with value N takes O(N2) time for initialization, so that is why the slowdown is occurring.

My initial thoughts: the self.order array is being filled one value at a time (in a loop) using local variables rank, time, and index. Can we simply save the state of those variables as members of the object, then calculate the value for self.order on the fly, whenever pick() is called, using the same logic in __init__?

EDIT: it will probably not be that simple

@ccrock4t Is the slowdown significant? I found that the initialization takes only around 8ms in my laptop.

ccrock4t · 2023-12-21T04:36:10Z

@ccrock4t Is the slowdown significant? I found that the initialization takes only around 8ms in my laptop.

Not for the current default number of buckets (100), but the time for initialization could become an issue if we need to significantly increase the granularity of Bag (e.g., to 10,000 or more).

Since it is only a 1-time operation, the slowdown would not really impact user experience during NARS operation. Only on startup. However the memory also takes up $O(n^2)$.

ccrock4t · 2023-12-21T04:36:32Z

@bowen-xu

bowen-xu · 2023-12-21T04:39:47Z

@ccrock4t
I agree that "the time for initialization could become an issue if we need to significantly increase the granularity of Bag (e.g., to 10,000 or more)." However, do we really need 10,000 levels? I think 100 levels is enough, as we don't need such high accuracy for priority. The control mechanism is not accurate.

ccrock4t · 2023-12-21T04:46:20Z

do we really need 10,000 levels? I think 100 levels is enough

I agree it is OK for now, since we only tend to use 100 buckets.

But, if we want to try more a lot more buckets in the future, we should revisit this.

add OpenNARS-style Distributor in Bag

c6801b5

maxeeem requested review from ccrock4t and bowen-xu December 18, 2023 18:01

maxeeem self-assigned this Dec 18, 2023

maxeeem linked an issue Dec 18, 2023 that may be closed by this pull request

[Bug] Add OpenNARS-style Distributor in Bag #83

Closed

use lru-cache instead of cache for backwards compatibility

a16503f

maxeeem mentioned this pull request Dec 19, 2023

[Bug] Add OpenNARS-style Distributor in Bag #83

Closed

adding comments to Distributor class; modifying test_bag_take_task

8d2005e

maxeeem marked this pull request as ready for review December 19, 2023 17:58

ccrock4t reviewed Dec 19, 2023

View reviewed changes

fix typo

c1ad5de

bowen-xu reviewed Dec 21, 2023

View reviewed changes

style fixes

f9c4270

bowen-xu merged commit 0632551 into dev Jan 11, 2024
1 check failed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add OpenNARS-style Distributor in Bag #84

add OpenNARS-style Distributor in Bag #84

maxeeem commented Dec 18, 2023

ccrock4t commented Dec 19, 2023

maxeeem commented Dec 19, 2023

ccrock4t commented Dec 19, 2023

ccrock4t Dec 19, 2023

ccrock4t Dec 19, 2023

maxeeem Dec 20, 2023

ccrock4t Dec 20, 2023

bowen-xu Dec 21, 2023 •

edited

Loading

maxeeem Dec 21, 2023

ccrock4t commented Dec 21, 2023 •

edited

Loading

bowen-xu Dec 20, 2023

maxeeem Dec 21, 2023

bowen-xu commented Dec 21, 2023

ccrock4t commented Dec 21, 2023 •

edited

Loading

ccrock4t commented Dec 21, 2023

bowen-xu commented Dec 21, 2023 •

edited

Loading

ccrock4t commented Dec 21, 2023 •

edited

Loading

		self.levels = tuple(list() for i in range(self.n_levels)) # initialize buckets between 0 and capacity

		self.currentCounter = 0

add OpenNARS-style Distributor in Bag #84

add OpenNARS-style Distributor in Bag #84

Conversation

maxeeem commented Dec 18, 2023

ccrock4t commented Dec 19, 2023

maxeeem commented Dec 19, 2023

ccrock4t commented Dec 19, 2023

ccrock4t Dec 19, 2023

Choose a reason for hiding this comment

ccrock4t Dec 19, 2023

Choose a reason for hiding this comment

maxeeem Dec 20, 2023

Choose a reason for hiding this comment

ccrock4t Dec 20, 2023

Choose a reason for hiding this comment

bowen-xu Dec 21, 2023 • edited Loading

Choose a reason for hiding this comment

maxeeem Dec 21, 2023

Choose a reason for hiding this comment

ccrock4t commented Dec 21, 2023 • edited Loading

bowen-xu Dec 20, 2023

Choose a reason for hiding this comment

maxeeem Dec 21, 2023

Choose a reason for hiding this comment

bowen-xu commented Dec 21, 2023

ccrock4t commented Dec 21, 2023 • edited Loading

ccrock4t commented Dec 21, 2023

bowen-xu commented Dec 21, 2023 • edited Loading

ccrock4t commented Dec 21, 2023 • edited Loading

bowen-xu Dec 21, 2023 •

edited

Loading

ccrock4t commented Dec 21, 2023 •

edited

Loading

ccrock4t commented Dec 21, 2023 •

edited

Loading

bowen-xu commented Dec 21, 2023 •

edited

Loading

ccrock4t commented Dec 21, 2023 •

edited

Loading