Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory usage and makes saving constraints optional #1532

Merged
merged 22 commits into from
Dec 8, 2020
Merged

Conversation

bqpd
Copy link
Contributor

@bqpd bqpd commented Dec 1, 2020

  • uses sets of varkeys instead of KeySets in many places
    • deletes KeySets after use for string-substitution during ConstraintSet creation
  • deletes substitution attributes after used for mmaping / dual calculation

Reduces memory consumption for large models with string substitutions during creation by ~30%

For future reference: consider not using KeySets for such one-offs at all, instead doing a one-pass over varkeys. Honestly such iteration is cheap enough that KeySets might be removed entirely, and KeyDicts used only for solutions. This wouldn't lead to large changes in speed/memory, but it would remove a lot of code.

Also makes saving constraints with a solution optional

@bqpd
Copy link
Contributor Author

bqpd commented Dec 1, 2020

test models please

@bqpd bqpd changed the title Reduce memory usage Reduce memory usage and makes saving constraints optional Dec 1, 2020
@bqpd
Copy link
Contributor Author

bqpd commented Dec 1, 2020

test models please

@bqpd
Copy link
Contributor Author

bqpd commented Dec 2, 2020

test models please

@bqpd
Copy link
Contributor Author

bqpd commented Dec 2, 2020

test models please

@bqpd
Copy link
Contributor Author

bqpd commented Dec 4, 2020

test this please

@bqpd
Copy link
Contributor Author

bqpd commented Dec 4, 2020

test models please


def __init__(self, constraints, substitutions=None): # pylint: disable=too-many-branches,too-many-statements
def __init__(self, constraints, substitutions=None, *, bonusvks=None): # pylint: disable=too-many-branches,too-many-statements
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docstring for bonusvks? Or at the very least an inline comment?

for las, nus, c in zip(la[1:], self.nu_by_posy[1:], self.hmaps[1:]):
while getattr(c, "parent", None) is not None:
c = c.parent
v_ss, c_senss = c.sens_from_dual(las, nus, result)
for vk, x in v_ss.items():
self.v_ss[vk] = x + self.v_ss.get(vk, 0)
while getattr(c, "generated_by", None) is not None:
gpv_ss[vk] = x + gpv_ss.get(vk, 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't thought deeply about this, but seems like this is acting a lot like a defaultdict? Could gpv_ss be a default dict, and then this simply becomes gpv_ss[vk] += x? Is defaultdict noticeably slower?

self.v_ss[vk] = x + self.v_ss.get(vk, 0)
while getattr(c, "generated_by", None) is not None:
gpv_ss[vk] = x + gpv_ss.get(vk, 0)
while getattr(c, "generated_by", None):
c = c.generated_by
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline comment here? Something like "loop to top of hierarchy"?

before = self.v_ss.get(c, 0)
self.v_ss[c] = before + dlogcost_dlogv*dlogv_dlogc
before = gpv_ss.get(c, 0)
gpv_ss[c] = before + dlogcost_dlogv*dlogv_dlogc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above, seems like defaultdict might simplify this?

@@ -45,7 +45,7 @@ class KeyMap:
collapse_arrays = False
keymap = []
log_gets = False
varkeys = None
cset = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation / inline comment for cset? 😬

if key.veckey is None or key.veckey not in self.substitutions:
continue
if np.isnan(self.substitutions[key.veckey][key.idx]):
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these continue cases seem complex enough that an inline comment would be beneficial

if key.value is not None and not key.constant:
del key.descr["value"]
if key.veckey and key.veckey.value is not None:
del key.veckey.descr["value"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?? I've read this a few times and still confused what's going on here... inline comment? (sorry broken record)

@@ -18,9 +18,6 @@ class NomialData(ReprMixin):

def __init__(self, hmap):
self.hmap = hmap
self.vks = set()
for exp in self.hmap:
self.vks.update(exp)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

orig_idx = origexps.index(orig_exp)
pmap[selfexps.index(self_exp)][orig_idx] = fraction
return pmap, m_from_ms
return pmap
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@bqpd
Copy link
Contributor Author

bqpd commented Dec 8, 2020

test models please

@bqpd bqpd merged commit a436015 into master Dec 8, 2020
@bqpd bqpd deleted the novfix branch December 8, 2020 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants