Issue search results

1k results

(93 ms) in microsoft/unilm

microsoft/unilm
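In training stage 2, how does the code freeze the attention layers?

According to the paper, once BEiT has been trained, the VFFN and attention are frozen and only the LFFN is trained. However, in the code you provided I could not find any layer being frozen during textonlymlm. Could you point out where in the code this is implemented?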

liu-zongxi

Opened 16 days ago

#1727

microsoft/unilm
The download link for trcor is invalid.

Hi there, I visited the link provided in the README, but access failed. The error message displayed was as follows: This XML file does not appear to have any style information associated ...

ruc-G

Opened 27 days ago

#1726

microsoft/unilm
ReSA Eval crashed

Describe the bug (ReSA): When I'm running eval_math_local.sh, it crashes and fails with an import error. Console Output: ...

MengAiDev

Opened 29 days ago

#1725

microsoft/unilm
Questions on BEiT v2 VQ-KD Pretraining

Thank you for publishing the BEiT v2 code! I’m pretraining BEiT v2 on a custom industrial dataset (1.1 M fault and normal images for training; 200 k normal images for validation) and have a few questions: ...

IKnowWhoo

Opened on Jul 4

#1724

microsoft/unilm
When will you open source the data synthesis code of paper "Scaling Laws of Synthetic Data for Language Model"

When will you open source the data synthesis code of paper Scaling Laws of Synthetic Data for Language Model

butterluo

Opened on Jul 3

#1722

microsoft/unilm
LHRMs (Large Hybrid-Reasoning Models) code & checkpoints referenced in paper 2505.14631 not found

Hi UniLM team, The recent paper “Think Only When You Need with Large Hybrid-Reasoning Models” states that its code and models would be released in this repository. I’ve searched the repo (branches, ...

almogtavor

Opened on May 31

#1719

microsoft/unilm
Differential Transformer loss spikes while training.

I have trained a 1.3B model using both the Differential Transformer and the standard Transformer. I observed a slight improvement in LLM evaluation scores for the Differential Transformer variant, and ...

fasil-saidalavi

Opened on May 20

#1718

microsoft/unilm
Bug: a potential bug in the `textdiffuser-2/inference_textdiffuser2_t2i_full.py` file.

Hi, I found a potential bug in the `textdiffuser-2/inference_textdiffuser2_t2i_full.py` file. `current_ocr` is overwritten as an empty list before iteration (line 558). In line 553, `current_ocr` is correctly ...

dogcdt

Opened on May 16

#1717

microsoft/unilm
Ablation tests with the same headdim as v with differential transformers

As I understand it, headdim is more important than the number of heads, and the diff transformer chooses to halve the number of heads and double the vdim compared to normal transformers. However, wouldn ...

RuiWang1998

Opened on May 13

#1716

microsoft/unilm
use wavlm to separate mix-audio

Hello, I have a question about how to use WavLM to separate mixed audio. Is it possible to use the speech features extracted by WavLM as the output of certain speech separation networks (such as ConvTasNet) ...

yangwyy

Opened on May 10

#1715

issues Search Results · repo:microsoft/unilm language:Python
