Issue search results

29 results (61 ms) in luyug/GradCache

luyug/GradCache
how to deal with same encoder

If both sides use the same encoder, that is $f = g$, how should I use GradCache?

CSWellesSun

Opened on Jul 25, 2024

luyug/GradCache
Support for Label-Dependent Loss Functions (e.g., Supervised Contrastive Loss)

First of all, thank you for developing GradCache and making it available to the community. It's been incredibly useful for my work. Currently, GradCache supports loss functions that do not require label ...

penguinwang96825

Opened on Jul 19, 2024

luyug/GradCache
Implement Grokfast into GradCache

I would like to implement the Grokfast algorithm, which adds an exponentially weighted mean of past gradients to the current gradients, with GradCache. I've been able to use it without GradCache, ...

ben-walczak

Opened on Jul 9, 2024

luyug/GradCache
training speed is very slow

Hi, I use grad_cache to train my model, but it seems very slow. I want to know whether this is normal. Does using grad cache generally affect the training speed?

liuweie

Opened on Jun 25, 2024

luyug/GradCache
Role of dot product operation in forward-backward pass

Hello, when reading the implementation, I noticed that in the forward-backward pass you use a dot product before running the backward pass, specifically in the following line: https://github.com/luyug/GradCache/blob/0c33638cb27c2519ad09c476824d550589a8ec38/src/grad_cache/grad_cache.py#L241 ...

ahmed-tabib

Opened on Mar 19, 2024

luyug/GradCache
Questions about training

Hi, it's great work! We have three inputs, designated i1, i2, and i3, which are to be processed by llama-7b. For input i1, I will extract two hidden states at two distinct locations and label ...

MikeDean2367

Opened on Mar 13, 2024

luyug/GradCache
distributed loss for multiple GPUs

Hi Luyu, thank you for your nice work. I have a question on Distributed Contrastive Loss: https://github.com/luyug/GradCache/blob/33695437d104e50a961cd9beba18b55c85a6537a/src/grad_cache/loss.py#L30-L34 ...

x-zb

Opened on Dec 26, 2023

luyug/GradCache
Multiple outputs implementation

Hello, Suppose my model returns multiple outputs. How should the functional approach be modified to handle this? Thanks.

Soumya-dutta

Opened on Dec 26, 2023

luyug/GradCache
Gradient update is extremely slow

I am trying to train an image-text contrastive learning model using the functional approach. The number of grad steps is 32 and the batch size per step is 32, which makes the total batch size ...

AshStuff

Opened on Dec 21, 2023

luyug/GradCache
How to use GradCache in non-single input function?

Great work! I find it works well when X and Y each have their own encoder, but for some reason I have to use this setting: X and Y have the same shape, X_i and Y_i form the positive pair, X_i and all Y_js are ...

lxx909546478

Opened on Jun 14, 2023

issues Search Results · repo:luyug/GradCache language:Python
