Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclusion of unnecessary parameters from AdvancedTreeSearchLmImageAndGlobalCacheJob hash calculation #514

Open
Marvin84 opened this issue Jun 18, 2024 · 16 comments · May be fixed by #446
Open

Comments

@Marvin84
Copy link
Contributor

Marvin84 commented Jun 18, 2024

Recently, I discovered that one terabyte of my space is occupied by different AdvancedTreeSearchLmImageAndGlobalCacheJobs that differ only by transition distortion penalty (TDP) values. However, all TDPs are part of our dynamic search procedure meaning that they are not relevant for the static search space and the global cache.

I am opening this issue to discuss the most clean way to exclude this or other parameters from the hash calculation. After a short discussion with @vieting, we think there are two different scenarios one can explore:

  1. Pass two separate crps to the AdvancedTreeSearchJob, where the additional crp, if not None, is used here. Otherwise the old behavior is performed.
  2. Change the hash calculation here, with two possible options, and by using a flag that keeps the default behavior
  • Deleted all non-relavant parameters from the RasrConfig if a default value within RASR is defined
  • Set some constant value where no default value is available

Please let me know what are your thoughts, before I prepare a PR.

@JackTemaki
Copy link
Contributor

This is possibly a duplicate of #430

I am not really familiar with the crp handling of RASR, so I can not tell what the best solution is, but deleting non-relevant parameters sounds like breaking existing setups.

@JackTemaki
Copy link
Contributor

@Marvin84 would #446 be a viable solution to your problem?

@Marvin84
Copy link
Contributor Author

This is possibly a duplicate of #430

I am not really familiar with the crp handling of RASR, so I can not tell what the best solution is, but deleting non-relevant parameters sounds like breaking existing setups.

I edited the original text, of course I meant that there is a flag that conserves the default behavior

@Marvin84
Copy link
Contributor Author

Marvin84 commented Jun 18, 2024

@Marvin84 would #446 be a viable solution to your problem?

The question is: why isn't it merged? I was not able to go through all the discussions there

@JackTemaki
Copy link
Contributor

The question is: why isn't it merged? I was not able to go through all the discussions there

@michelwi still had open changes requested, and I assume @NeoLegends just changed his priorities as there was not enough interest from the rest to have it solved.

@NeoLegends
Copy link
Contributor

NeoLegends commented Jun 18, 2024

Yeah, #430 points at exactly the same issue. Feel free to take over #446. I have other priorities at the moment.

@Marvin84
Copy link
Contributor Author

@NeoLegends could you do a summary of why #446 was not merged yet and what were the open questions?

@JackTemaki
Copy link
Contributor

@Marvin84 this is already answered, you can see the list of @michelwi requested changes at the end of the PR

@michelwi
Copy link
Contributor

#446 would reduce the problem but not mitigate it completely: We would not duplicate the lm image which takes up most of the wasted storage, but we would still have multiple unnecessary duplicates of the global cache.

So we can leave the LmImageAndGlobalCacheJob as is (and deprecate it) but apply the same discussion points to the separate GlobalCacheJob.

@Marvin84
Copy link
Contributor Author

Marvin84 commented Jun 18, 2024

we would still have multiple unnecessary duplicates of the global cache.

Exactly.

So we can leave the LmImageAndGlobalCacheJob as is (and deprecate it) but apply the same discussion points to the separate GlobalCacheJob.

Not sure if I understand what you are proposing.
So, you are saying let's have two separate jobs (LMImageJob and GlobalCacheJob) and for the GlobalCacheJob find a solution from the list of solutions I proposed above?

@michelwi
Copy link
Contributor

We already have two separate jobs. I say:

  1. Lets use them (this is what Add option to generate LM image and GC via two separate jobs #446 is about)
  2. Find a solution for the GlobalCacheJob e.g. from the list you proposed above

@Marvin84
Copy link
Contributor Author

I say

Yeah agreed.

We already have two separate jobs

If this is not a merged PR, then I would not say that we have them. Is it correct as @JackTemaki said, that in order to merge it we should address your last list of comments? Or is there anything changed in the meanwhile?

@michelwi
Copy link
Contributor

We already have two separate jobs

If this is not a merged PR, then I would not say that we have them.

The jobs exist. They are just not used currently in the i6 standard pipelines.

Is it correct as @JackTemaki said, that in order to merge it we should address your last list of comments? Or is there anything changed in the meanwhile?

I haven't looked at it in a while, but I assume I still hold to my comments. But I can give it another look later.

@Marvin84
Copy link
Contributor Author

But I can give it another look later.

that would be great, thank you

@michelwi
Copy link
Contributor

But I can give it another look later.

Upon inspection I found all previous points (of me and others) still valid and I found even more problems that I added to the review.

@Marvin84
Copy link
Contributor Author

Thank you, I will look into this next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment