issues Search Results · repo:luyug/GradCache language:Python

29 results

If I have the same encoder, that is $f=g$, how to use GradCache?
  • CSWellesSun
  • Opened on Jul 25, 2024
  • #33

First of all, thank you for developing GradCache and making it available for the community. It's been incredibly useful for my work. Currently, GradCache supports loss functions that do not require label ...
  • penguinwang96825
  • Opened on Jul 19, 2024
  • #32

I would like to implement the algorithm for grokfast, which is an exponentially weighted mean of past gradients added to the current gradients, with GradCache. I've been able to use it without GradCache, ...
  • ben-walczak
  • 2
  • Opened on Jul 9, 2024
  • #31

Hi, I use grad_cache to train my model, but it seems very slow. I want to know: is this normal? Does using grad cache generally affect the training speed?
  • liuweie
  • 6
  • Opened on Jun 25, 2024
  • #30

Hello, when reading the implementation, I noticed that in the forward-backward pass, you used a dot product before running the backward pass, specifically in the following line: https://github.com/luyug/GradCache/blob/0c33638cb27c2519ad09c476824d550589a8ec38/src/grad_cache/grad_cache.py#L241 ...
  • ahmed-tabib
  • Opened on Mar 19, 2024
  • #29

Hi, it's great work! We have three inputs designated as i1, i2, and i3, which are to be processed by the llama-7b. For input i1, I will extract two hidden states at two distinct locations and label ...
  • MikeDean2367
  • Opened on Mar 13, 2024
  • #28

Hi Luyu, thank you for your nice work. I have a question on Distributed Contrastive Loss: https://github.com/luyug/GradCache/blob/33695437d104e50a961cd9beba18b55c85a6537a/src/grad_cache/loss.py#L30-L34 ...
  • x-zb
  • 4
  • Opened on Dec 26, 2023
  • #25

Hello, Suppose my model returns multiple outputs. How should the functional approach be modified to handle this? Thanks.
  • Soumya-dutta
  • 1
  • Opened on Dec 26, 2023
  • #24

I am trying to train an image-text contrastive learning model using the functional approach. The number of grad steps is 32 and the batch size per step is 32, which makes the total batch size ...
  • AshStuff
  • 2
  • Opened on Dec 21, 2023
  • #23

Great work! I find it works well when X and Y each have their own encoder, but for some reason I have to use this setting: X and Y have the same shape, X_i and Y_i are a positive pair, X_i and all Y_js are ...
  • lxx909546478
  • Opened on Jun 14, 2023
  • #22