issues Search Results · repo:microsoft/unilm language:Python
1k results
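According to the paper, after BEiT is trained, the VFFN and attention are frozen and only the LFFN is trained. However, in the code you provided I could not find any layers being frozen during text-only MLM. Could you tell me where in the code this is implemented?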
liu-zongxi
- Opened 16 days ago
- #1727
Hi there, I visited the link provided in the README, but access failed. The error message displayed was as
follows:
This XML file does not appear to have any style information associated ...
ruc-G
- Opened 27 days ago
- #1726
Describe the bug ReSA
The problem arises when using: When I'm running eval_math_local.sh, it crashes and fails with an import error
A clear and concise description of what the bug is. Console Output: ...
MengAiDev
- 2
- Opened 29 days ago
- #1725
Thank you for publishing the BEiT v2 code! I’m pretraining BEiT v2 on a custom industrial dataset (1.1 M fault and
normal images for training; 200 k normal images for validation) and have a few questions: ...
IKnowWhoo
- 1
- Opened on Jul 4
- #1724
When will you open source the data synthesis code for the paper "Scaling Laws of Synthetic Data for Language Model"?
butterluo
- Opened on Jul 3
- #1722
Hi UniLM team,
The recent paper “Think Only When You Need with Large Hybrid-Reasoning Models” states that its code and models would be
released in this repository.
I’ve searched the repo (branches, ...
almogtavor
- 1
- Opened on May 31
- #1719
I have trained a 1.3B model using both the Differential Transformer and the standard Transformer. I observed a slight
improvement in LLM evaluation scores for the Differential Transformer variant, and ...
fasil-saidalavi
- 5
- Opened on May 20
- #1718
Hi, I found a potential bug in the textdiffuser-2/inference_textdiffuser2_t2i_full.py file.
current_ocr is overwritten with an empty list before iteration (line 558)
In line 553, current_ocr is correctly ...
dogcdt
- Opened on May 16
- #1717
As I understand it, headdim is more important than the number of heads, and the diff transformer chooses to halve the
number of heads and double the vdim compared to normal transformers.
However, wouldn ...
RuiWang1998
- 3
- Opened on May 13
- #1716
Hello, I have a question about how to use WavLM to separate mixed audio. Is it possible to use the speech features extracted
by WavLM as the output of certain speech separation networks (such as ConvTasNet) ...
yangwyy
- Opened on May 10
- #1715
