-
Notifications
You must be signed in to change notification settings - Fork 0
Map for module_id to section_id mapping #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
nix-apollo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look a bit more after lunch.
| def create_module_name_sections( | ||
| self, | ||
| n_blocks: int, | ||
| node_layers: list[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly related to this PR, but I dislike this naming convention of blocks and layers. Imo "layers" implies GPT2 layers. Ideally I'd like it if things were named:
n_blocks->n_layersnode_layers->section_start_modulesor something
Maybe not worth the change over costs though (at least not part of this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm mostly just curious if you agree. If so, should make a low priority git issue about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
||
| self.has_folded_bias = True | ||
|
|
||
| def create_section_id_to_module_id_mapping(self) -> list[tuple[str, str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on having convenience functions to get a particular element of the mapping? Eg model.section_to_module and model.module_to_section. Not sure if it's worth it, probably fine to just add later on if we are missing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I guess this is implemented in #183
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep this is handled in #183 (and is actually handled in the current PR at the bottom of the init)
| self.has_folded_bias = True | ||
|
|
||
| def create_section_id_to_module_id_mapping(self) -> list[tuple[str, str]]: | ||
| """Create a list of tuples mapping `sections.section_name.section_idx` (section_id) to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these called "idx" if they are strings not ints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are actually integer indices into a list. If you want to access a module in a SequentialTransformer you can do either:
model.sections.section_2[3]- use the "section_id" of "sections.section_2.3" and do
rib.models.utils.get_model_attr(model, section_id)
Those return the same thing. I think I'll try explain this better in the docstring to SequentialTransformer in Refactor SequentialTransformer for cleanliness #183.
| mappings: list[tuple[str, str]] = seq_model.create_section_id_to_module_id_mapping() | ||
|
|
||
| # Obtained by manually inspecting the model and counting the layers | ||
| expected_mappings = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very confused where these come from, and reading the documentation of the transformer class or the create_section_id_to_module_id_mapping` fn doesn't really help. Probably it's worth clarifying what a "section_id" and "module_id" mean intuitively and how that relates to the node layers?
Or we can accept that this is something people will only understand if it's personally explained to them :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `module_name.block_idx` (module_id). | ||
| Returns: | ||
| A list of tuples where each tuple is (section_id, module_id). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to guarantee order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters provided the resulting dictionaries map things correctly.
| ] | ||
|
|
||
| mappings: list[tuple[str, str]] = seq_model.create_section_id_to_module_id_mapping() | ||
| for expected in expected_mappings: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to test the reverse too, but I think that's in #183
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah #183 tests the actual tuples which is better, good spot
nix-apollo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm, although do look through comments and consider name changes and doc improvements.
I haven't run the code.
Thanks! I'm going to merge this one, and I'll update #183 with some of your suggested changes clarifications. |
Map for module_id to section_id mapping
Map for module_id to section_id mapping
Description
Adds function
create_section_id_to_module_id_mappingfor mapping section_ids to module_idsRelated Issues
Closes #162
Motivation and Context
It is very hard to know what module_id sections.section_1.17 corresponds to. This creates a mapping as an attribute of a SequentialTransformer instance.
How Has This Been Tested?
Add several tests for this mapping by eyeballing what the mapping should be on a two-layer model, gpt2 and pythia