Skip to content

Format image#14

Merged
EiffL merged 46 commits intomainfrom
format_image
May 23, 2025
Merged

Format image#14
EiffL merged 46 commits intomainfrom
format_image

Conversation

@lhparker1
Copy link
Member

Update AION code to have image preprocessing native in the codec

@lhparker1 lhparker1 requested a review from EiffL May 22, 2025 21:04
@EiffL
Copy link
Contributor

EiffL commented May 23, 2025

I've made a few changes, it's important to respect and not modify the base Codec class.

It's not yet passing the tests but almost

Copy link
Collaborator

@LTMeyer LTMeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's getting shapes! Thank you. I made a couple of comments.

return self.encode(x, channel_mask)


class QuantizedCodec(Codec):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing this class?
It made the distinction simpler between codec that rely on quantization and tokenizer that don't. It avoids the if statements in several methods. Do we actually have any tokenizer that is not using a quantizer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to do minimal changes. I didn't want to change the codec APIs, but I guess that ship has sailed ^^'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you just give me this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll change ^^. But just to explain why I didn't want to change the API here:

My objective was not to make a clean code base here, for two reasons:

  • The more we refactor, the more work it is to refactor all tokenizers + the higher the chance of introducing difference in behavior
  • This code is going to stay 100% frozen, we are not going to reuse any of it moving forward. So, this is not the right place to do a lot of refactoring, that would be on the MMOMA side.

Making small edits on what relates to the directly relevant user-facing API (i.e. encode/decode) that makes sense from the perspective of simplifying the user experience, but the rest I would have preferred not to change anything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hummm actually, looking at the code, I'd rather not change it.

All the codecs we have for AION-1 are QuantizedCodec, no point in adding subclasses. I'll remove the 'is_quantized' statements if you want, but no need for more complexity

EiffL and others added 4 commits May 23, 2025 17:30
Co-authored-by: Lucas Meyer <LTMeyer@users.noreply.github.com>
Co-authored-by: Lucas Meyer <LTMeyer@users.noreply.github.com>
@EiffL EiffL merged commit 0ce9b47 into main May 23, 2025
2 checks passed
@EiffL EiffL deleted the format_image branch May 23, 2025 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants