Releases: InternLM/InternLM
InternLM-v0.2.1dev20240102
What's Changed
- fix(timeout): larger timeout by @JiaoPL in #495
- feat(doc): add GPU memory info for 7B & 20B models by @li126com in #507
- feat(model): add rope_base interface by @00INDEX in #512
- Feat(QA): Check loss when swapping micro_num and micro_bsz && Check grad norm by @li126com in #510
- Fix(QA): the py name in main is wrong by @li126com in #514
- fix/feat: small fix and enhancement by @SolenoidWGT in #515
- test(workflow): add workflow for loss test and change trigger event by @kkscilife in #513
- fix(ci): fix test model ckpt ci test by @SolenoidWGT in #518
- test(workflow): add unit test case by @kkscilife in #524
- feat(storage): use multipart upload when using oss by @li126com in #520
- Fix (QA checkpoint): fix test_model_checkpoint singleton import by @li126com in #526
- fix(model): add IS_SEQUENCE_PARALLEL check for norm module by @yingtongxiong in #528
- feat(model): add output embedding tf32 option by @JiaoPL in #523
- feat(grad_norm): vocab grad norm profiling by @JiaoPL in #519
- fix(data): fix the unpack for type_ids when use_flash_attn=False by @yingtongxiong in #516
- fix(storage): unify the name of AK and SK by @li126com in #527
- fix(test): fix type_ids unpack bug by @SolenoidWGT in #530
- feat(model): support llama model with checkpoint loading by @li126com in #532
- fix(metric): add metric dtype control by @Pryest in #533
- feat(ckpt): support auto resume in Volc and Ali by @li126com in #529
- fix(sequence_parallel): fix norm all-reduce in seq_parallel when not overlapping by @yingtongxiong in #534
- fix(pp): fix no-packed dataset load micro batch error by @SolenoidWGT in #538
- fix(model): change model_type LLAMA to LLAMA2 by @li126com in #539
- fix(moe): fix moe zero mode bug by @blankde in #548
- fix(grad_norm): token grad norm with tp by @JiaoPL in #547
- test(workflow): change into reserved by @kkscilife in #550
- fix(model): add ckpt_type constraint when loading ckpts by @li126com in #542
- feat(logger): add tensorboard key value buffer by @SolenoidWGT in #549
- fix(metrics): remove redundant cuda memory in metric calculations by @SolenoidWGT in #557
- fix(lr_scheduler): fix when resuming lr_scheduler without loading optimizer by @gaoyang07 in #565
Full Changelog: v0.2.1dev20231121...v0.2.1dev20240102
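For readers unfamiliar with what a rope_base setting (#512) typically controls, here is a minimal sketch of the standard rotary position embedding frequency formula, where the base sets how quickly frequencies decay across dimensions. The function name `rope_inv_freq` and its signature are hypothetical and are not taken from the InternLM code base.

```python
def rope_inv_freq(dim, rope_base=10000.0):
    """Return the dim/2 inverse frequencies of standard rotary embeddings.

    Hypothetical helper for illustration only, not InternLM's API.
    Frequency i is 1 / rope_base ** (2 * i / dim).
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    return [1.0 / (rope_base ** (2 * i / dim)) for i in range(dim // 2)]


# A larger base lowers every frequency except the first one.
default_freqs = rope_inv_freq(4)
large_base_freqs = rope_inv_freq(4, rope_base=1_000_000.0)
```

Raising the base stretches the rotation wavelengths, which is the usual reason such a parameter is exposed.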
InternLM-v0.2.1dev20230922
TBD
InternLM-v0.2.1dev20230915
Highlights
- fix a bug that could cause gradient overflow when total_steps is small
- fix the rotary_emb.inv_freq KeyError in tool convert2hf.py
- add unit test for model
What's Changed
🐞 Bug fixes
- fix(convert2hf.py): fix the rotary_emb.inv_freq KeyError by @jiangtann in #299
- fix(configs/7B_sft.py): model dtype float16 to bfloat16 by @huangting4201 in #302
- fix(chat): fix stream_chat to return generator by @zhjunqin in #123
📚 Documentations
- docs(doc/code-docs): update quickstart usage by @huangting4201 in #301
- docs(doc/code-docs): add figure for training docs by @zigzagcai in #307
✅ Tests
- tests(tests/test_model): add unit test for model by @li126com in #300
- tests(tests/test_solver): add unit test for optimizer by @li126com in #303
Full Changelog: v0.2.1dev20230909...v0.2.1dev20230915
InternLM-v0.2.1dev20230909
What's Changed
- fix(ckpt): fix snapshot none load error and remove file lock by @SolenoidWGT in #298
Full Changelog: v0.2.1dev20230908...v0.2.1dev20230909
InternLM-v0.2.1dev20230908
Highlights
- fix a bug that could produce NaN values when overlapping gradient allreduce with the backward pass
- support timeout wrapper and runtime diagnosis
- support readthedocs Chinese version
What's Changed
🚀 Features
- feat(monitor): add light monitor by @JiaoPL in #275
- feat(utils): add timeout wrapper by @SolenoidWGT in #286
- feat: add runtime diagnosis by @sunpengsdu in #297
💥 Improvements
- fix(storage): refactor and fix storage_manager api by @SolenoidWGT in #281
- Feat/sync grad use async op by @sunpengsdu in #277
🐞 Bug fixes
- fix(doc/code-docs): autodoc shown error by @huangting4201 in #265
- fix(eval): no need to check length of valid_dl when using streaming dataset by @00INDEX in #274
- fix/broadcast should not in commu stream by @sunpengsdu in #276
- fix(model): set tensor parallel attribute for mlp by @yingtongxiong in #271
- feat(ckpt): checkpoint bug fixes and feature enhancements. by @SolenoidWGT in #259
- fix(ckpt): fix checkpoint reload bug by @SolenoidWGT in #282
- fix(core/context): use dummy mode to generate random numbers in model construction by @blankde in #266
- fix(monitor): add alert switch and refactor monitor config by @JiaoPL in #285
- fix: fix the bug to do bcast in a stream by @sunpengsdu in #294
📚 Documentations
- docs(*): add documentation and reST files for readthedocs by @zigzagcai in #272
- docs(doc/code-docs): support zh cn readthedocs by @huangting4201 in #289
- docs(fsdp): add training option for fsdp by @zaglc in #273
- docs(doc/code-docs): refine profiler docs by @zigzagcai in #295
New Contributors
- @JiaoPL made their first contribution in #275
- @blankde made their first contribution in #266
- @zigzagcai made their first contribution in #272
- @zaglc made their first contribution in #273
Full Changelog: v0.2.1dev20230901...v0.2.1dev20230908
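As an illustration of the general idea behind a timeout wrapper such as the one added in #286, the sketch below shows one common way to bound a call's running time with a decorator. It is not InternLM's implementation: the name `with_timeout` and the thread-based approach are assumptions made for this example.

```python
import functools
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def with_timeout(seconds):
    """Hypothetical decorator: raise TimeoutError if the call runs too long."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            executor = ThreadPoolExecutor(max_workers=1)
            future = executor.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except FutureTimeout:
                raise TimeoutError(
                    f"{func.__name__} exceeded {seconds}s"
                ) from None
            finally:
                # Do not block on the worker; it keeps running in the background.
                executor.shutdown(wait=False)
        return wrapper
    return decorator


@with_timeout(1.0)
def fast_step(x):
    return x + 1


@with_timeout(0.05)
def slow_step():
    time.sleep(0.3)
    return "done"
```

Note that a thread-based wrapper only stops waiting for the call; it cannot interrupt the underlying work, so a real training framework may prefer a signal-based or process-based mechanism.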
InternLM-v0.2.1dev20230901
Highlights
- Support centos and ubuntu dockerfile
- Support runtime gpu flops and nccl allreduce speed test
What's Changed
🚀 Features
- Implement uniform_init for tensor by @Pryest in #252
- Support centos and ubuntu dockerfile by @li126com in #220 #243
- Support writer add_scalars for writing dict data by @huangting4201 in #257
- Support runtime gpu flops and nccl allreduce speed test by @sunpengsdu in #254
🐞 Bug fixes
- Fix StreamingDataset does not have an __len__ method by @00INDEX in #251
- Fix argument missing in getting loss metrics by @MagicDevilZhang in #256
- Fix the error that RotaryEmbedding is converted to a non-fp32 format during operation by @YWMditto in #239
📚 Documentations
- Update readme structure by @huangting4201 in #240
- Support readthedocs by @huangting4201 in #245 #264
InternLM-v0.2.0
Features:
- Support pipeline parallel, including interleaved and non-interleaved pipeline scheduler.
- Support sequence parallel.
- Support model evaluation.
- Support tf32 with flash-attention.
- Support tensorboard writer for recording training performance metrics.
- Support custom uniscale logger.
- Support calculating model's accuracy and perplexity metrics.
- Support oss storage and checkpoint asynchronous uploading.
- Support automatically loading the latest checkpoint.
- Support checkpoint snapshot.
- Support monitoring the status of training jobs and alerting on abnormal status.
- Support torch profiler.
- Support simple memory profiler.
Optimizations:
- Overlapping optimizer parameters broadcast with model forward.
- Overlapping optimizer last bucket gradients allreduce with compute norm.
InternLM-v0.1.0
fix huggingface link (#219)