Skip to content

Conversation

@danbraunai-apollo
Copy link
Contributor

Map for module_id to section_id mapping

Description

Adds function create_section_id_to_module_id_mapping for mapping section_ids to module_ids

Related Issues

Closes #162

Motivation and Context

It is very hard to know what module_id sections.section_1.17 corresponds to. This creates a mapping as an attribute of a SequentialTransformer instance.

How Has This Been Tested?

Add several tests for this mapping by eyeballing what the mapping should be on a two-layer model, gpt2 and pythia

Copy link
Contributor

@nix-apollo nix-apollo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look a bit more after lunch.

def create_module_name_sections(
self,
n_blocks: int,
node_layers: list[str],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to this PR, but I dislike this naming convention of blocks and layers. Imo "layers" implies GPT2 layers. Ideally I'd like it if things were named:

  • n_blocks -> n_layers
  • node_layers -> section_start_modules or something

Maybe not worth the change over costs though (at least not part of this PR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm mostly just curious if you agree. If so, should make a low priority git issue about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I like this. I think section_start_module_ids is better once we have the formal introduction of what a module_id is in #183. Created Issue #190 for this


self.has_folded_bias = True

def create_section_id_to_module_id_mapping(self) -> list[tuple[str, str]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on having convenience functions to get a particular element of the mapping? Eg model.section_to_module and model.module_to_section. Not sure if it's worth it, probably fine to just add later on if we are missing them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I guess this is implemented in #183

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this is handled in #183 (and is actually handled in the current PR at the bottom of the init)

self.has_folded_bias = True

def create_section_id_to_module_id_mapping(self) -> list[tuple[str, str]]:
"""Create a list of tuples mapping `sections.section_name.section_idx` (section_id) to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these called "idx" if they are strings not ints?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are actually integer indices into a list. If you want to access a module in a SequentialTransformer you can do either:

  1. model.sections.section_2[3]
  2. use the "section_id" of "sections.section_2.3" and do rib.models.utils.get_model_attr(model, section_id)
    Those return the same thing. I think I'll try explain this better in the docstring to SequentialTransformer in Refactor SequentialTransformer for cleanliness #183.

mappings: list[tuple[str, str]] = seq_model.create_section_id_to_module_id_mapping()

# Obtained by manually inspecting the model and counting the layers
expected_mappings = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very confused where these come from, and reading the documentation of the transformer class or the create_section_id_to_module_id_mapping` fn doesn't really help. Probably it's worth clarifying what a "section_id" and "module_id" mean intuitively and how that relates to the node layers?

Or we can accept that this is something people will only understand if it's personally explained to them :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this is a bit awkward. I'll try explain a little better in #183. I think that docstring, along with the new naming if addressed in #190 should make it a bit clearer.

`module_name.block_idx` (module_id).
Returns:
A list of tuples where each tuple is (section_id, module_id).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to guarantee order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters provided the resulting dictionaries map things correctly.

]

mappings: list[tuple[str, str]] = seq_model.create_section_id_to_module_id_mapping()
for expected in expected_mappings:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good to test the reverse too, but I think that's in #183

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah #183 tests the actual tuples which is better, good spot

Copy link
Contributor

@nix-apollo nix-apollo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, although do look through comments and consider name changes and doc improvements.

I haven't run the code.

@danbraunai-apollo
Copy link
Contributor Author

Lgtm, although do look through comments and consider name changes and doc improvements.

I haven't run the code.

Thanks! I'm going to merge this one, and I'll update #183 with some of your suggested changes clarifications.

@danbraunai-apollo danbraunai-apollo merged commit 6267954 into main Nov 13, 2023
CindyWuApollo pushed a commit that referenced this pull request Dec 5, 2023
Map for module_id to section_id mapping
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make a mapping function from sections to node_layer names

4 participants