
State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

🤗 Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

These models can be applied to:

  • 📝 Text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages.
  • 🖼️ Images, for tasks like image classification, object detection, and segmentation.
  • 🗣️ Audio, for tasks like speech recognition and audio classification.

Transformer models can also perform tasks on several modalities combined, such as table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.

🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets, and share them with the community on our model hub. At the same time, each Python module defining an architecture is fully standalone and can be modified to enable quick research experiments.

🤗 Transformers is backed by the three most popular deep learning libraries, Jax, PyTorch and TensorFlow, with a seamless integration between them. It is straightforward to train a model with one and then load it for inference with another.

Online demos

You can test most of our models directly on their pages from the model hub. We also offer private model hosting, versioning, and an inference API for public and private models.

Here are a few examples:

In Natural Language Processing:

In Computer Vision:

In Audio:

In Multimodal tasks:

Write With Transformer, built by the Hugging Face team, is the official demo of this repository's text generation capabilities.

If you are looking for custom support from the Hugging Face team

HuggingFace Expert Acceleration Program

Quick tour

To immediately use a model on a given input (text, image, audio, ...), we provide the pipeline API. A pipeline groups together a pretrained model with the preprocessing that was used during that model's training. Here is how to use a pipeline to classify positive versus negative texts:

>>> from transformers import pipeline

# Allocate a pipeline for sentiment-analysis
>>> classifier = pipeline('sentiment-analysis')
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996980428695679}]

The second line of code downloads and caches the pretrained model used by the pipeline, while the third evaluates it on the given text. Here, the answer is "positive" with a confidence of 99.97%.
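As a small illustration of how this result can be consumed, the sketch below takes the output list printed above (copied as a literal, so no model download is needed) and formats the score as a percentage. The helper name `describe` is hypothetical and is not part of the library.

```python
# Output copied from the sentiment-analysis example above.
result = [{'label': 'POSITIVE', 'score': 0.9996980428695679}]


def describe(prediction):
    """Format one prediction dict as 'LABEL (xx.xx%)'. Hypothetical helper."""
    return f"{prediction['label']} ({prediction['score'] * 100:.2f}%)"


summary = describe(result[0])
print(summary)  # POSITIVE (99.97%)
```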

Many tasks have a pretrained pipeline ready to go, in NLP but also in computer vision and speech. For example, we can easily extract the objects detected in an image:

>>> import requests
>>> from PIL import Image
>>> from transformers import pipeline

# Download an image with cute cats
>>> url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"
>>> image_data = requests.get(url, stream=True).raw
>>> image = Image.open(image_data)

# Allocate a pipeline for object detection
>>> object_detector = pipeline('object-detection')
>>> object_detector(image)
[{'score': 0.9982201457023621,
  'label': 'remote',
  'box': {'xmin': 40, 'ymin': 70, 'xmax': 175, 'ymax': 117}},
 {'score': 0.9960021376609802,
  'label': 'remote',
  'box': {'xmin': 333, 'ymin': 72, 'xmax': 368, 'ymax': 187}},
 {'score': 0.9954745173454285,
  'label': 'couch',
  'box': {'xmin': 0, 'ymin': 1, 'xmax': 639, 'ymax': 473}},
 {'score': 0.9988006353378296,
  'label': 'cat',
  'box': {'xmin': 13, 'ymin': 52, 'xmax': 314, 'ymax': 470}},
 {'score': 0.9986783862113953,
  'label': 'cat',
  'box': {'xmin': 345, 'ymin': 23, 'xmax': 640, 'ymax': 368}}]

Here, we get a list of the objects detected in the image, each with a surrounding box and a confidence score. The original image is shown on the left, with the predictions displayed on the right.
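Because the detections are plain Python dictionaries, they can be post-processed without any extra tooling. The sketch below works on a literal copy of the output shown above; the helper names `box_area` and `select` and the 0.9 score threshold are illustrative assumptions, not library features.

```python
# Detections copied from the object-detection output above.
detections = [
    {'score': 0.9982201457023621, 'label': 'remote',
     'box': {'xmin': 40, 'ymin': 70, 'xmax': 175, 'ymax': 117}},
    {'score': 0.9960021376609802, 'label': 'remote',
     'box': {'xmin': 333, 'ymin': 72, 'xmax': 368, 'ymax': 187}},
    {'score': 0.9954745173454285, 'label': 'couch',
     'box': {'xmin': 0, 'ymin': 1, 'xmax': 639, 'ymax': 473}},
    {'score': 0.9988006353378296, 'label': 'cat',
     'box': {'xmin': 13, 'ymin': 52, 'xmax': 314, 'ymax': 470}},
    {'score': 0.9986783862113953, 'label': 'cat',
     'box': {'xmin': 345, 'ymin': 23, 'xmax': 640, 'ymax': 368}},
]


def box_area(box):
    """Area of a bounding box in square pixels."""
    return (box['xmax'] - box['xmin']) * (box['ymax'] - box['ymin'])


def select(items, label, min_score=0.9):
    """Keep detections with the given label and at least min_score confidence."""
    return [d for d in items if d['label'] == label and d['score'] >= min_score]


cats = select(detections, 'cat')
largest = max(detections, key=lambda d: box_area(d['box']))
print(len(cats), largest['label'])  # 2 couch
```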

You can learn more about the tasks supported by the pipeline API in this tutorial.

In addition to pipeline, three lines of code are all it takes to download and use any of the pretrained models on your given task. Here is the PyTorch version:

>>> from transformers import AutoTokenizer, AutoModel

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
>>> model = AutoModel.from_pretrained("bert-base-uncased")

>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> outputs = model(**inputs)

And here is the equivalent code for TensorFlow:

>>> from transformers import AutoTokenizer, TFAutoModel

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
>>> model = TFAutoModel.from_pretrained("bert-base-uncased")

>>> inputs = tokenizer("Hello world!", return_tensors="tf")
>>> outputs = model(**inputs)

The tokenizer is responsible for all the preprocessing the pretrained model expects, and can be called directly on a single string (as in the examples above) or on a list. It outputs a dictionary that you can use in downstream code or simply pass directly to your model using the ** argument unpacking operator.
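To illustrate only the `**` unpacking step without downloading a model, here is a minimal sketch in plain Python. `fake_tokenizer` and `fake_model` are hypothetical stand-ins, not library functions, and the toy "ids" are just word lengths. The key names `input_ids` and `attention_mask` are assumed here to mirror what BERT-style tokenizers typically return.

```python
def fake_tokenizer(texts):
    """Hypothetical stand-in: accept a string or a list of strings and
    return a dict of equally long (padded) lists."""
    if isinstance(texts, str):
        texts = [texts]
    token_lists = [text.split() for text in texts]
    max_len = max(len(tokens) for tokens in token_lists)
    input_ids, attention_mask = [], []
    for tokens in token_lists:
        pad = max_len - len(tokens)
        # Toy "ids": the length of each word; 0 marks padding.
        input_ids.append([len(tok) for tok in tokens] + [0] * pad)
        attention_mask.append([1] * len(tokens) + [0] * pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def fake_model(input_ids, attention_mask):
    """Hypothetical stand-in: count the non-padding tokens per sequence."""
    return [sum(mask) for mask in attention_mask]


inputs = fake_tokenizer(["Hello world!", "Hi"])
# Equivalent to fake_model(input_ids=..., attention_mask=...).
outputs = fake_model(**inputs)
print(outputs)  # [2, 1]
```

The dictionary keys become keyword arguments, which is why the tokenizer output can be handed to the model in one step.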

The model itself is a regular PyTorch nn.Module or a TensorFlow tf.keras.Model (depending on your backend), which you can use as usual. This tutorial explains how to integrate such a model into a classic PyTorch or TensorFlow training loop, or how to use our Trainer API to quickly fine-tune on a new dataset.

Why should I use transformers?

  1. Easy-to-use state-of-the-art models:

    • High performance on natural language understanding and generation, computer vision, and audio tasks.
    • Low barrier to entry for educators and practitioners.
    • Few user-facing abstractions, with just three classes to learn.
    • A unified API for using all our pretrained models.
  2. Lower compute costs, smaller carbon footprint:

    • Researchers can share trained models instead of always retraining.
    • Practitioners can reduce compute time and production costs.
    • Many architectures with over 60,000 pretrained models across all modalities.
  3. Choose the right framework for every part of a model's lifetime:

    • Train state-of-the-art models in 3 lines of code.
    • Move a single model between TF2.0/PyTorch/JAX frameworks at will.
    • Seamlessly pick the right framework for training, evaluation, and production.
  4. Easily customize a model or an example to your needs:

    • We provide examples for each architecture to reproduce the results published by its original authors.
    • Model internals are exposed as consistently as possible.
    • Model files can be used independently of the library for quick experiments.

Why shouldn't I use transformers?

  • This library is not a modular toolbox of building blocks for neural nets. The code in the model files is intentionally not refactored with additional abstractions, so that researchers can quickly iterate on each model without diving into additional abstractions or files.
  • The training API is not intended to work on any model but is optimized to work with the models provided by the library. For generic machine learning loops, you should use another library (possibly Accelerate).
  • While we strive to present as many use cases as possible, the scripts in our examples folder are just that: examples. They are not expected to work out of the box on your specific problem, and you will likely need to change a few lines of code to adapt them to your needs.

Installation

With pip

This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.

You should install 🤗 Transformers in a virtual environment. If you are unfamiliar with Python virtual environments, check out the user guide.

First, create a virtual environment with the version of Python you are going to use and activate it.

Then, you will need to install at least one of Flax, PyTorch, or TensorFlow. Please refer to the TensorFlow installation page, the PyTorch installation page, and the Flax and Jax installation pages for the specific installation command for your platform.
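A quick way to check these prerequisites from inside the activated environment is sketched below using only the standard library. The import names `flax`, `torch`, and `tensorflow` are assumed to be the module names of the three backends, and `installed_backends` is a hypothetical helper. The sketch only reports whether each module can be found and does not verify the minimum versions listed above.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum Python version stated above
BACKENDS = ("flax", "torch", "tensorflow")  # assumed import names


def installed_backends(names=BACKENDS):
    """Map each module name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


python_ok = sys.version_info >= MIN_PYTHON
status = installed_backends()

print(f"Python >= {MIN_PYTHON[0]}.{MIN_PYTHON[1]}: {python_ok}")
for name, found in status.items():
    print(f"{name}: {'found' if found else 'not found'}")
if not any(status.values()):
    print("No backend found; install at least one of Flax, PyTorch, or TensorFlow.")
```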

Once one of those backends has been installed, 🤗 Transformers can be installed using pip as follows:

pip install transformers

If you would like to play with the examples, or need the bleeding edge of the code and cannot wait for a new release, you must install the library from source.

With conda

🤗 Transformers can be installed using conda as follows:

conda install conda-forge::transformers

NOTE: Installing transformers from the huggingface channel is deprecated.

Follow the installation pages of Flax, PyTorch, or TensorFlow to see how to install them with conda.

NOTE: On Windows, you may be prompted to activate Developer Mode in order to benefit from caching. If you run into this, please let us know in this issue.

Model architectures

All the model checkpoints provided by 🤗 Transformers are seamlessly integrated from the huggingface.co model hub, where they are uploaded directly by users and organizations.

Current number of checkpoints:

🤗 Transformers currently provides the following architectures (see here for a high-level summary of each of them):

  1. ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
  2. ALIGN (from Google Research) released with the paper Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
  3. AltCLIP (from BAAI) released with the paper AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
  4. Audio Spectrogram Transformer (from MIT) released with the paper AST: Audio Spectrogram Transformer by Yuan Gong, Yu-An Chung, James Glass.
  5. Autoformer (from Tsinghua University) released with the paper Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
  6. Bark (from Suno) released in the repository suno-ai/bark by Suno AI team.
  7. BART (from Facebook) released with the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
  8. BARThez (from École polytechnique) released with the paper BARThez: a Skilled Pretrained French Sequence-to-Sequence Model by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis.
  9. BARTpho (from VinAI Research) released with the paper BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese by Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen.
  10. BEiT (from Microsoft) released with the paper BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Furu Wei.
  11. BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
  12. BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
  13. BERTweet (from VinAI Research) released with the paper BERTweet: A pre-trained language model for English Tweets by Dat Quoc Nguyen, Thanh Vu and Anh Tuan Nguyen.
  14. BigBird-Pegasus (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
  15. BigBird-RoBERTa (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
  16. BioGpt (from Microsoft Research AI4Science) released with the paper BioGPT: generative pre-trained transformer for biomedical text generation and mining by Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon and Tie-Yan Liu.
  17. BiT (from Google AI) released with the paper Big Transfer (BiT) by Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby.
  18. Blenderbot (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
  19. BlenderbotSmall (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
  20. BLIP (from Salesforce) released with the paper BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation by Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi.
  21. BLIP-2 (from Salesforce) released with the paper BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models by Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi.
  22. BLOOM (from BigScience workshop) released by the BigScience Workshop.
  23. BORT (from Alexa) released with the paper Optimal Subarchitecture Extraction For BERT by Adrian de Wynter and Daniel J. Perry.
  24. BridgeTower (from Harbin Institute of Technology/Microsoft Research Asia/Intel Labs) released with the paper BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning by Xiao Xu, Chenfei Wu, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan.
  25. BROS (from NAVER CLOVA) released with the paper BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents by Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park.
  26. ByT5 (from Google Research) released with the paper ByT5: Towards a token-free future with pre-trained byte-to-byte models by Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel.
  27. CamemBERT (from Inria/Facebook/Sorbonne) released with the paper CamemBERT: a Tasty French Language Model by Louis Martin*, Benjamin Muller*, Pedro Javier Ortiz Suárez*, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
  28. CANINE (from Google Research) released with the paper CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting.
  29. Chinese-CLIP (from OFA-Sys) released with the paper Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese by An Yang, Junshu Pan, Junyang Lin, Rui Men, Yichang Zhang, Jingren Zhou, Chang Zhou.
  30. CLAP (from LAION-AI) released with the paper Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation by Yusong Wu, Ke Chen, Tianyu Zhang, Yuchen Hui, Taylor Berg-Kirkpatrick, Shlomo Dubnov.
  31. CLIP (from OpenAI) released with the paper Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever.
  32. CLIPSeg (from University of Göttingen) released with the paper Image Segmentation Using Text and Image Prompts by Timo Lüddecke and Alexander Ecker.
  33. CLVP released with the paper Better speech synthesis through scaling by James Betker.
  34. CodeGen (from Salesforce) released with the paper A Conversational Paradigm for Program Synthesis by Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.
  35. CodeLlama (from MetaAI) released with the paper Code Llama: Open Foundation Models for Code by Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve.
  36. Conditional DETR (from Microsoft Research Asia) released with the paper Conditional DETR for Fast Training Convergence by Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang.
  37. ConvBERT (from YituTech) released with the paper ConvBERT: Improving BERT with Span-based Dynamic Convolution by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
  38. ConvNeXT (from Facebook AI) released with the paper A ConvNet for the 2020s by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.
  39. ConvNeXTV2 (from Facebook AI) released with the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie.
  40. CPM (from Tsinghua University) released with the paper CPM: A Large-scale Generative Chinese Pre-trained Language Model by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
  41. CPM-Ant (from OpenBMB) released by OpenBMB.
  42. CTRL (from Salesforce) released with the paper CTRL: A Conditional Transformer Language Model for Controllable Generation by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
  43. CvT (from Microsoft) released with the paper CvT: Introducing Convolutions to Vision Transformers by Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.
  44. Data2Vec (from Facebook) released with the paper Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
  45. DeBERTa (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
  46. DeBERTa-v2 (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
  47. Decision Transformer (from Berkeley/Facebook/Google) released with the paper Decision Transformer: Reinforcement Learning via Sequence Modeling by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
  48. Deformable DETR (from SenseTime Research) released with the paper Deformable DETR: Deformable Transformers for End-to-End Object Detection by Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai.
  49. DeiT (from Facebook) released with the paper Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
  50. DePlot (from Google AI) released with the paper DePlot: One-shot visual language reasoning by plot-to-table translation by Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun.
  51. DETA (from The University of Texas at Austin) released with the paper NMS Strikes Back by Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl.
  52. DETR (from Facebook) released with the paper End-to-End Object Detection with Transformers by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
  53. DialoGPT (from Microsoft Research) released with the paper DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation by Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan.
  54. DiNAT (from SHI Labs) released with the paper Dilated Neighborhood Attention Transformer by Ali Hassani and Humphrey Shi.
  55. DINOv2 (from Meta AI) released with the paper DINOv2: Learning Robust Visual Features without Supervision by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
  56. DistilBERT (from HuggingFace) released with the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method was used to compress GPT2, RoBERTa, and Multilingual BERT; the compressed models are named DistilGPT2, DistilRoBERTa, and DistilmBERT respectively.
  57. DiT (from Microsoft Research) released with the paper DiT: Self-supervised Pre-training for Document Image Transformer by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
  58. Donut (from NAVER) released with the paper OCR-free Document Understanding Transformer by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.
  59. DPR (from Facebook) released with the paper Dense Passage Retrieval for Open-Domain Question Answering by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
  60. DPT (from Intel Labs) released with the paper Vision Transformers for Dense Prediction by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.
  61. EfficientFormer (from Snap Research) released with the paper EfficientFormer: Vision Transformers at MobileNetSpeed by Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren.
  62. EfficientNet (from Google Brain) released with the paper EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks by Mingxing Tan, Quoc V. Le.
  63. ELECTRA (from Google Research/Stanford University) released with the paper ELECTRA: Pre-training text encoders as discriminators rather than generators by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
  64. EnCodec (from Meta AI) released with the paper High Fidelity Neural Audio Compression by Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi.
  65. EncoderDecoder (from Google Research) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
  66. ERNIE (from Baidu) released with the paper ERNIE: Enhanced Representation through Knowledge Integration by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
  67. ErnieM (from Baidu) released with the paper ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora by Xuan Ouyang, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang.
  68. ESM (from Meta AI) are transformer protein language models. ESM-1b was released with the paper Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. ESM-1v was released with the paper Language models enable zero-shot prediction of the effects of mutations on protein function by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. ESM-2 and ESMFold were released with the paper Language models of protein sequences at the scale of evolution enable accurate structure prediction by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.
  69. Falcon (from Technology Innovation Institute) by Almazrouei, Ebtesam and Alobeidli, Hamza and Alshamsi, Abdulaziz and Cappelli, Alessandro and Cojocaru, Ruxandra and Debbah, Merouane and Goffinet, Etienne and Heslow, Daniel and Launay, Julien and Malartic, Quentin and Noune, Badreddine and Pannier, Baptiste and Penedo, Guilherme.
  70. FastSpeech2Conformer (from ESPnet and Microsoft Research) released with the paper Fastspeech 2: Fast And High-quality End-to-End Text To Speech by Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, and Yuekai Zhang.
  71. FLAN-T5 (from Google AI) released in the repository google-research/t5x by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
  72. FLAN-UL2 (from Google AI) released in the repository google-research/t5x by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
  73. FlauBERT (from CNRS) released with the paper FlauBERT: Unsupervised Language Model Pre-training for French by Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab.
  74. FLAVA (from Facebook AI) released with the paper FLAVA: A Foundational Language And Vision Alignment Model by Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, and Douwe Kiela.
  75. FNet (from Google Research) released with the paper FNet: Mixing Tokens with Fourier Transforms by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon.
  76. FocalNet (from Microsoft Research) released with the paper Focal Modulation Networks by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
  77. Funnel Transformer (from CMU/Google Brain) released with the paper Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
  78. Fuyu (from ADEPT) released with a blog post by Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar.
  79. GIT (from Microsoft Research) released with the paper GIT: A Generative Image-to-text Transformer for Vision and Language by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
  80. GLPN (from KAIST) released with the paper Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
  81. GPT (from OpenAI) released with the paper Improving Language Understanding by Generative Pre-Training by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
  82. GPT Neo (from EleutherAI) released in the repository EleutherAI/gpt-neo by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
  83. GPT NeoX (from EleutherAI) released with the paper GPT-NeoX-20B: An Open-Source Autoregressive Language Model by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach.
  84. GPT NeoX Japanese (from ABEJA) released by Shinya Otani, Takayoshi Makabe, Anuj Arora, and Kyo Hattori.
  85. GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
  86. GPT-J (from EleutherAI) released in the repository kingoflolz/mesh-transformer-jax by Ben Wang and Aran Komatsuzaki.
  87. GPT-Sw3 (from AI-Sweden) released with the paper Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish by Ariel Ekgren, Amaru Cuba Gyllensten, Evangelia Gogoulou, Alice Heiman, Severine Verlinden, Joey Öhman, Fredrik Carlsson, Magnus Sahlgren.
  88. GPTBigCode (from BigCode) released with the paper SantaCoder: don't reach for the stars! by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.
  89. GPTSAN-japanese released in the repository tanreinama/GPTSAN by Toshiyuki Sakamoto (tanreinama).
  90. Graphormer (from Microsoft) released with the paper Do Transformers Really Perform Bad for Graph Representation? by Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu.
  91. GroupViT (from UCSD, NVIDIA) released with the paper GroupViT: Semantic Segmentation Emerges from Text Supervision by Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
  92. HerBERT (from Allegro.pl, AGH University of Science and Technology) released with the paper KLEJ: Comprehensive Benchmark for Polish Language Understanding by Piotr Rybak, Robert Mroczkowski, Janusz Tracz, Ireneusz Gawlik.
  93. Hubert (from Facebook) released with the paper HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
  94. I-BERT (from Berkeley) released with the paper I-BERT: Integer-only BERT Quantization by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
  95. IDEFICS (from HuggingFace) released with the paper OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents by Hugo Laurençon, Lucile Saulnier, Léo Tronchon, Stas Bekman, Amanpreet Singh, Anton Lozhkov, Thomas Wang, Siddharth Karamcheti, Alexander M. Rush, Douwe Kiela, Matthieu Cord, Victor Sanh.
  96. ImageGPT (from OpenAI) released with the paper Generative Pretraining from Pixels by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
  97. Informer (from Beihang University, UC Berkeley, Rutgers University, SEDD Company) released with the paper Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting by Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang.
  98. InstructBLIP (from Salesforce) released with the paper InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning by Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi.
  99. Jukebox (from OpenAI) released with the paper Jukebox: A Generative Model for Music by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever.
  100. KOSMOS-2 (from Microsoft Research Asia) released with the paper Kosmos-2: Grounding Multimodal Large Language Models to the World by Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei.
  101. LayoutLM (from Microsoft Research Asia) released with the paper LayoutLM: Pre-training of Text and Layout for Document Image Understanding by Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou.
  102. LayoutLMv2 (from Microsoft Research Asia) released with the paper LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding by Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou.
  103. LayoutLMv3 (from Microsoft Research Asia) released with the paper LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking by Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei.
  104. LayoutXLM (from Microsoft Research Asia) released with the paper LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding by Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei.
  105. LED (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, Arman Cohan.
  106. LeViT (from Meta AI) released with the paper LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference by Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze.
  107. LiLT (from South China University of Technology) released with the paper LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding by Jiapeng Wang, Lianwen Jin, Kai Ding.
  108. LLaMA (from The FAIR team of Meta AI) released with the paper LLaMA: Open and Efficient Foundation Language Models by Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample.
  109. Llama2 (from The FAIR team of Meta AI) released with the paper Llama2: Open Foundation and Fine-Tuned Chat Models by Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom.
  110. LLaVa (from Microsoft Research & University of Wisconsin-Madison) released with the paper Visual Instruction Tuning by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.
  111. Longformer (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, Arman Cohan.
  112. LongT5 (from Google AI) released with the paper LongT5: Efficient Text-To-Text Transformer for Long Sequences by Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang.
  113. LUKE (from Studio Ousia) released with the paper LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention by Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto.
  114. LXMERT (from UNC Chapel Hill) released with the paper LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering by Hao Tan and Mohit Bansal.
  115. M-CTC-T (from Facebook) released with the paper Pseudo-Labeling For Massively Multilingual Speech Recognition by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert.
  116. M2M100 (from Facebook) released with the paper Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
  117. MADLAD-400 (from Google) released with the paper MADLAD-400: A Multilingual And Document-Level Large Audited Dataset by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
  118. MarianMT Machine translation models trained using OPUS data by Jörg Tiedemann. The Marian Framework is being developed by the Microsoft Translator Team.
  119. MarkupLM (from Microsoft Research Asia) released with the paper MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
  120. Mask2Former (from FAIR and UIUC) released with the paper Masked-attention Mask Transformer for Universal Image Segmentation by Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.
  121. MaskFormer (from Meta and UIUC) released with the paper Per-Pixel Classification is Not All You Need for Semantic Segmentation by Bowen Cheng, Alexander G. Schwing, Alexander Kirillov.
  122. MatCha (from Google AI) released with the paper MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering by Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos.
  123. mBART (from Facebook) released with the paper Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
  124. mBART-50 (from Facebook) released with the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.
  125. MEGA (from Facebook) released with the paper Mega: Moving Average Equipped Gated Attention by Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, and Luke Zettlemoyer.
  126. Megatron-BERT (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
  127. Megatron-GPT2 (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
  128. MGP-STR (from Alibaba Research) released with the paper Multi-Granularity Prediction for Scene Text Recognition by Peng Wang, Cheng Da, and Cong Yao.
  129. Mistral (from Mistral AI) by The Mistral AI team: Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
  130. Mixtral (from Mistral AI) by The Mistral AI team: Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
  131. mLUKE (from Studio Ousia) released with the paper mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka.
  132. MMS (from Facebook) released with the paper Scaling Speech Technology to 1,000+ Languages by Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli.
  133. MobileBERT (from CMU/Google Brain) released with the paper MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou.
  134. MobileNetV1 (from Google Inc.) released with the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications by Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam.
  135. MobileNetV2 (from Google Inc.) released with the paper MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen.
  136. MobileViT (from Apple) released with the paper MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer by Sachin Mehta and Mohammad Rastegari.
  137. MobileViTV2 (from Apple) released with the paper Separable Self-attention for Mobile Vision Transformers by Sachin Mehta and Mohammad Rastegari.
  138. MPNet (from Microsoft Research) released with the paper MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
  139. MPT (from MosaicML) released in the repository llm-foundry by the MosaicML NLP Team.
  140. MRA (from the University of Wisconsin - Madison) released with the paper Multi Resolution Analysis (MRA) by Zhanpeng Zeng, Sourav Pal, Jeffery Kline, Glenn M Fung, Vikas Singh.
  141. MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
  142. MusicGen (from Meta) released with the paper Simple and Controllable Music Generation by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
  143. MVP (from RUC AI Box) released with the paper MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
  144. NAT (from SHI Labs) released with the paper Neighborhood Attention Transformer by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
  145. Nezha (from Huawei Noah’s Ark Lab) released with the paper NEZHA: Neural Contextualized Representation for Chinese Language Understanding by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
  146. NLLB (from Meta) released with the paper No Language Left Behind: Scaling Human-Centered Machine Translation by the NLLB team.
  147. NLLB-MOE (from Meta) released with the paper No Language Left Behind: Scaling Human-Centered Machine Translation by the NLLB team.
  148. Nougat (from Meta AI) released with the paper Nougat: Neural Optical Understanding for Academic Documents by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.
  149. Nyströmformer (from the University of Wisconsin - Madison) released with the paper Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh.
  150. OneFormer (from SHI Labs) released with the paper OneFormer: One Transformer to Rule Universal Image Segmentation by Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi.
  151. OpenLlama (from s-JoL) released on GitHub (now removed).
  152. OPT (from Meta AI) released with the paper OPT: Open Pre-trained Transformer Language Models by Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen et al.
  153. OWL-ViT (from Google AI) released with the paper Simple Open-Vocabulary Object Detection with Vision Transformers by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby.
  154. OWLv2 (from Google AI) released with the paper Scaling Open-Vocabulary Object Detection by Matthias Minderer, Alexey Gritsenko, Neil Houlsby.
  155. PatchTSMixer (from IBM Research) released with the paper TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting by Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, Jayant Kalagnanam.
  156. PatchTST (from IBM) released with the paper A Time Series is Worth 64 Words: Long-term Forecasting with Transformers by Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam.
  157. Pegasus (from Google) released with the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
  158. PEGASUS-X (from Google) released with the paper Investigating Efficiently Extending Transformers for Long Input Summarization by Jason Phang, Yao Zhao, and Peter J. Liu.
  159. Perceiver IO (from Deepmind) released with the paper Perceiver IO: A General Architecture for Structured Inputs & Outputs by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
  160. Persimmon (from ADEPT) released with a blog post by Erich Elsen, Augustus Odena, Maxwell Nye, Sağnak Taşırlar, Tri Dao, Curtis Hawthorne, Deepak Moparthi, Arushi Somani.
  161. Phi (from Microsoft) released with the papers - Textbooks Are All You Need by Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee and Yuanzhi Li, Textbooks Are All You Need II: phi-1.5 technical report by Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar and Yin Tat Lee.
  162. PhoBERT (from VinAI Research) released with the paper PhoBERT: Pre-trained language models for Vietnamese by Dat Quoc Nguyen and Anh Tuan Nguyen.
  163. Pix2Struct (from Google) released with the paper Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding by Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova.
  164. PLBart (from UCLA NLP) released with the paper Unified Pre-training for Program Understanding and Generation by Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang.
  165. PoolFormer (from Sea AI Labs) released with the paper MetaFormer is Actually What You Need for Vision by Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan.
  166. Pop2Piano released with the paper Pop2Piano : Pop Audio-based Piano Cover Generation by Jongho Choi, Kyogu Lee.
  167. ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
  168. PVT (from Nanjing University, The University of Hong Kong etc.) released with the paper Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.
  169. QDQBert (from NVIDIA) released with the paper Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
  170. RAG (from Facebook) released with the paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks by Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela.
  171. REALM (from Google Research) released with the paper REALM: Retrieval-Augmented Language Model Pre-Training by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
  172. Reformer (from Google Research) released with the paper Reformer: The Efficient Transformer by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
  173. RegNet (from META Platforms) released with the paper Designing Network Design Spaces by Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár.
  174. RemBERT (from Google Research) released with the paper Rethinking embedding coupling in pre-trained language models by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
  175. ResNet (from Microsoft Research) released with the paper Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
  176. RoBERTa (from Facebook) released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
  177. RoBERTa-PreLayerNorm (from Facebook) released with the paper fairseq: A Fast, Extensible Toolkit for Sequence Modeling by Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli.
  178. RoCBert (from WeChatAI) released with the paper RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining by Hui Su, Weiwei Shi, Xiaoyu Shen, Xiao Zhou, Tuo Ji, Jiarui Fang, Jie Zhou.
  179. RoFormer (from ZhuiyiTechnology) released with the paper RoFormer: Enhanced Transformer with Rotary Position Embedding by Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen and Yunfeng Liu.
  180. RWKV (from Bo Peng) released on this repo by Bo Peng.
  181. SeamlessM4T (from Meta AI) released with the paper SeamlessM4T: Massively Multilingual & Multimodal Machine Translation by the Seamless Communication team.
  182. SeamlessM4Tv2 (from Meta AI) released with the paper Seamless: Multilingual Expressive and Streaming Speech Translation by the Seamless Communication team.
  183. SegFormer (from NVIDIA) released with the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
  184. Segment Anything (from Meta AI) released with the paper Segment Anything by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
  185. SEW (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
  186. SEW-D (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
  187. SigLIP (from Google AI) released with the paper Sigmoid Loss for Language Image Pre-Training by Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, Lucas Beyer.
  188. SpeechT5 (from Microsoft Research) released with the paper SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing by Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei.
  189. SpeechToTextTransformer (from Facebook) released with the paper fairseq S2T: Fast Speech-to-Text Modeling with fairseq by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino.
  190. SpeechToTextTransformer2 (from Facebook) released with the paper Large-Scale Self- and Semi-Supervised Learning for Speech Translation by Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau.
  191. Splinter (from Tel Aviv University) released with the paper Few-Shot Question Answering by Pretraining Span Selection by Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy.
  192. SqueezeBERT (from Berkeley) released with the paper SqueezeBERT: What can computer vision teach NLP about efficient neural networks? by Forrest N. Iandola, Albert E. Shaw, Ravi Krishna, and Kurt W. Keutzer.
  193. SwiftFormer (from MBZUAI) released with the paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications by Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan.
  194. Swin Transformer (from Microsoft) released with the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.
  195. Swin Transformer V2 (from Microsoft) released with the paper Swin Transformer V2: Scaling Up Capacity and Resolution by Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo.
  196. Swin2SR (from University of Würzburg) released with the paper Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration by Marcos V. Conde, Ui-Jin Choi, Maxime Burchi, Radu Timofte.
  197. SwitchTransformers (from Google) released with the paper Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity by William Fedus, Barret Zoph, Noam Shazeer.
  198. T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
  199. T5v1.1 (from Google AI) released in the repository google-research/text-to-text-transfer-transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
  200. Table Transformer (from Microsoft Research) released with the paper PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents by Brandon Smock, Rohith Pesala, Robin Abraham.
  201. TAPAS (from Google AI) released with the paper TAPAS: Weakly Supervised Table Parsing via Pre-training by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
  202. TAPEX (from Microsoft Research) released with the paper TAPEX: Table Pre-training via Learning a Neural SQL Executor by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
  203. Time Series Transformer (from HuggingFace).
  204. TimeSformer (from Facebook) released with the paper Is Space-Time Attention All You Need for Video Understanding? by Gedas Bertasius, Heng Wang, Lorenzo Torresani.
  205. Trajectory Transformer (from the University of California at Berkeley) released with the paper Offline Reinforcement Learning as One Big Sequence Modeling Problem by Michael Janner, Qiyang Li, Sergey Levine.
  206. Transformer-XL (from Google/CMU) released with the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
  207. TrOCR (from Microsoft) released with the paper TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
  208. TVLT (from UNC Chapel Hill) released with the paper TVLT: Textless Vision-Language Transformer by Zineng Tang, Jaemin Cho, Yixin Nie, Mohit Bansal.
  209. TVP (from Intel) released with the paper Text-Visual Prompting for Efficient 2D Temporal Video Grounding by Yimeng Zhang, Xin Chen, Jinghan Jia, Sijia Liu, Ke Ding.
  210. UL2 (from Google Research) released with the paper Unifying Language Learning Paradigms by Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Xavier Garcia, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Neil Houlsby, Donald Metzler.
  211. UMT5 (from Google Research) released with the paper UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining by Hyung Won Chung, Xavier Garcia, Adam Roberts, Yi Tay, Orhan Firat, Sharan Narang, Noah Constant.
  212. UniSpeech (from Microsoft Research) released with the paper UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.
  213. UniSpeechSat (from Microsoft Research) released with the paper UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu.
  214. UnivNet (from Kakao Corporation) released with the paper UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation by Won Jang, Dan Lim, Jaesam Yoon, Bongwan Kim, and Juntae Kim.
  215. UPerNet (from Peking University) released with the paper Unified Perceptual Parsing for Scene Understanding by Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun.
  216. VAN (from Tsinghua University and Nankai University) released with the paper Visual Attention Network by Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu.
  217. VideoMAE (from Multimedia Computing Group, Nanjing University) released with the paper VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training by Zhan Tong, Yibing Song, Jue Wang, Limin Wang.
  218. ViLT (from NAVER AI Lab/Kakao Enterprise/Kakao Brain) released with the paper ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision by Wonjae Kim, Bokyung Son, Ildoo Kim.
  219. VipLlava (from University of Wisconsin–Madison) released with the paper Making Large Multimodal Models Understand Arbitrary Visual Prompts by Mu Cai, Haotian Liu, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, Dennis Park, Yong Jae Lee.
  220. Vision Transformer (ViT) (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
  221. VisualBERT (from UCLA NLP) released with the paper VisualBERT: A Simple and Performant Baseline for Vision and Language by Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang.
  222. ViT Hybrid (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
  223. VitDet (from Meta AI) released with the paper Exploring Plain Vision Transformer Backbones for Object Detection by Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He.
  224. ViTMAE (from Meta AI) released with the paper Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick.
  225. ViTMatte (from HUST-VL) released with the paper ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers by Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang.
  226. ViTMSN (from Meta AI) released with the paper Masked Siamese Networks for Label-Efficient Learning by Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas.
  227. VITS (from Kakao Enterprise) released with the paper Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech by Jaehyeon Kim, Jungil Kong, Juhee Son.
  228. ViViT (from Google Research) released with the paper ViViT: A Video Vision Transformer by Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lučić, Cordelia Schmid.
  229. Wav2Vec2 (from Facebook AI) released with the paper wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli.
  230. Wav2Vec2-Conformer (from Facebook AI) released with the paper FAIRSEQ S2T: Fast Speech-to-Text Modeling with FAIRSEQ by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, Juan Pino.
  231. Wav2Vec2Phoneme (from Facebook AI) released with the paper Simple and Effective Zero-shot Cross-lingual Phoneme Recognition by Qiantong Xu, Alexei Baevski, Michael Auli.
  232. WavLM (from Microsoft Research) released with the paper WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing by Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei.
  233. Whisper (from OpenAI) released with the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever.
  234. X-CLIP (from Microsoft Research) released with the paper Expanding Language-Image Pretrained Models for General Video Recognition by Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling.
  235. X-MOD (from Meta AI) released with the paper Lifting the Curse of Multilinguality by Pre-training Modular Transformers by Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe.
  236. XGLM (from Facebook AI) released with the paper Few-shot Learning with Multilingual Language Models by Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li.
  237. XLM (from Facebook) released with the paper Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau.
  238. XLM-ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
  239. XLM-RoBERTa (from Facebook AI) released with the paper Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau*, Kartikay Khandelwal*, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov.
  240. XLM-RoBERTa-XL (from Facebook AI) released with the paper Larger-Scale Transformers for Multilingual Masked Language Modeling by Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau.
  241. XLM-V (from Meta AI) released with the paper XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models by Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa.
  242. XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
  243. XLS-R (from Facebook AI) released with the paper XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale by Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli.
  244. XLSR-Wav2Vec2 (from Facebook AI) released with the paper Unsupervised Cross-Lingual Representation Learning For Speech Recognition by Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli.
  245. YOLOS (from Huazhong University of Science & Technology) released with the paper You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection by Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, Wenyu Liu.
  246. YOSO (from the University of Wisconsin - Madison) released with the paper You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling by Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh.
  247. Want to contribute a new model? A detailed guide and templates are available to walk you through adding one. You can find them in the templates folder of the repository. Before starting your PR, be sure to read the contributing guidelines and either contact the maintainers or open an issue to collect feedback.

To check whether each model has an implementation in Flax, PyTorch, or TensorFlow, or has an associated tokenizer backed by the 🤗 Tokenizers library, refer to this table.

These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations. You can find more details on performance in the Examples section of the documentation.

Learn more

| Section | Description |
|-|-|
| Documentation | Full API documentation and tutorials |
| Task summary | Tasks supported by 🤗 Transformers |
| Preprocessing tutorial | Using the Tokenizer class to prepare data for the models |
| Training and fine-tuning | Using the models provided by 🤗 Transformers in a PyTorch/TensorFlow training loop and the Trainer API |
| Quick tour: Fine-tuning/usage scripts | Example scripts for fine-tuning models on a wide range of tasks |
| Model sharing and uploading | Upload and share your fine-tuned models with the community |
| Migration | Migrate to 🤗 Transformers from pytorch-transformers or pytorch-pretrained-bert |

Citation

We now have a paper you can cite for the 🤗 Transformers library:

@inproceedings{wolf-etal-2020-transformers,
    title = "Transformers: State-of-the-Art Natural Language Processing",
    author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = oct,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
    pages = "38--45"
}