
replace linear with KAN #173

Closed · wants to merge 2 commits
Conversation

SenmiaoORZ

No description provided.

Collaborator

@gkielian gkielian left a comment


This. is. great.

This also appears to improve parameter efficiency, and I'm very curious to see the inference results.

So as part of the review:

  1. First, let's wrap this in a module within:
    variations/linear_variations.py

  2. Then add the import to model.py

  3. Add a kan option to the linear_variation options of train.py

Afterwards, let's start sweeps exploring the cross-product space with other options :D, as well as seeing what this implementation might mean at the hardware level
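The three steps above could be sketched roughly as follows; note the registry name `linear_dictionary`, the helper `get_linear`, and the `--linear_variant` flag are illustrative assumptions here, not the repo's actual identifiers:

```python
import argparse

import torch.nn as nn

# variations/linear_variations.py -- registry of swappable linear layers.
# A wrapped KAN module would be registered here (name is hypothetical).
linear_dictionary = {
    "linear": nn.Linear,
    # "kan": KANLinear,
}


# model.py -- look up the implementation by name instead of
# hard-coding nn.Linear in the model definition.
def get_linear(variant: str) -> type:
    return linear_dictionary[variant]


# train.py -- expose the choice on the command line.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--linear_variant",
    default="linear",
    choices=sorted(linear_dictionary.keys()),
)
```

With this pattern, sweeping over linear variations later is just a matter of iterating over the registry keys.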

@gkielian gkielian self-assigned this May 23, 2024
@gkielian
Collaborator

gkielian commented May 27, 2024

I have been testing out the KAN module, and it seems to work better as an MLP replacement than as an nn.Linear replacement.

Haven't been able to get stable inference yet with the nn.Linear replacement, but inference with the MLP replacement works well until the model begins overfitting.

Let's hold off on adding it to nn.Linear until we can confirm that inference works; we can speak more on Tuesday about the next steps.

That it does appear to work for the MLP is awesome, and I'm looking forward to discussing next steps.

@gkielian
Collaborator

gkielian commented Jun 5, 2024

@SenmiaoORZ

I just finished making some adjustments, more details in:
SenmiaoORZ#1

In essence, I rewrote the KAN net a little bit to allow for better polymorphism with the existing linear variations, added argparse parameters for the base activation type, polynomial order, and other KAN features, and added a stability enhancement for the forward pass and inference that lets us use sample.py (and we can probably try the attention replacement again now too, I think).
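As a rough illustration of those adjustments (this is a sketch, not the actual code from SenmiaoORZ#1), a polynomial-basis KAN-style layer with a configurable base activation, a polynomial-order parameter, and input squashing plus output normalization for forward-pass stability might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class KANLinear(nn.Module):
    """KAN-style stand-in for nn.Linear (illustrative sketch only).

    Combines a base-activation path with learnable per-edge polynomial
    coefficients; poly_order and base_activation mirror the argparse
    parameters described above.
    """

    def __init__(self, in_features, out_features,
                 poly_order=3, base_activation=nn.SiLU):
        super().__init__()
        self.poly_order = poly_order
        self.base_activation = base_activation()
        # MLP-like base path weight.
        self.base_weight = nn.Parameter(
            torch.randn(out_features, in_features) * 0.02)
        # One coefficient per (output, input, power) triple, stored flat.
        self.poly_weight = nn.Parameter(
            torch.randn(out_features, in_features * (poly_order + 1)) * 0.02)
        # Output norm as a stand-in for the stability enhancement.
        self.norm = nn.LayerNorm(out_features)

    def forward(self, x):
        # Base path, as in a plain activated linear layer.
        base = F.linear(self.base_activation(x), self.base_weight)
        # Squash inputs to [-1, 1] so high powers cannot blow up.
        xs = torch.tanh(x)
        # Power basis [1, x, x^2, ..., x^p] for each input feature.
        basis = torch.cat(
            [xs ** i for i in range(self.poly_order + 1)], dim=-1)
        poly = F.linear(basis, self.poly_weight)
        return self.norm(base + poly)
```

Because the constructor keeps the same `(in_features, out_features)` leading signature as nn.Linear, it can be swapped in polymorphically through the linear-variation registry.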

@gkielian
Collaborator

gkielian commented Jun 7, 2024

Moving work to #182 to merge the latest changes directly into the repo

2 participants