issues Search Results · repo:myshell-ai/JetMoE language:Python

10 results

Thanks for the good work. I recently evaluated JetMoE on GSM8K and HumanEval and found that my scores were higher than those you reported. So I want to tell you that the GSM8K I got is 39.4, and ...
  • FFFzy
  • Opened on Dec 13, 2024
  • #13

Could you please provide a mapping between the parameters of the JetMoE model (model.layers.{}...) and the parameters of the Hugging Face GPT model (transformer.h.{}...)? I am very interested in using ...
  • takgto
  • Opened on Jun 10, 2024
  • #11

Really great work. Could you show us the instruction template so we can make better use of the chatbot?
  • ZiangWu-77
  • Opened on Apr 26, 2024
  • #10

Thanks a lot for the great work. It would immensely benefit the community if the training script (the exact script used to orchestrate the training of the JetMoE-8B model) were made public. Thanks.
  • spookyQubit
  • 2
  • Opened on Apr 15, 2024
  • #9

from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig, AutoModelForSequenceClassification
from jetmoe import JetMoEForCausalLM, JetMoEConfig, JetMoEForSequenceClassification
AutoConfig.register( ...
  • laoshaw
  • Opened on Apr 12, 2024
  • #8

Hi! What was the reasoning behind not using the exact Mixtral architecture? Thank you for your work.
  • SinanAkkoyun
  • 1
  • Opened on Apr 12, 2024
  • #7

Will the pretraining datasets and corresponding code be open-sourced? Thanks!
  • hitalex
  • 5
  • Opened on Apr 10, 2024
  • #6

The jetmoe-8b model runs fine, but with jetmoe-8b-chat, even on the latest transformers and tokenizer, I get: Traceback (most recent call last): File /home/cqrl/.local/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py ...
  • Sukii
  • 7
  • Opened on Apr 8, 2024
  • #4

Hello, great work by you guys. How do we carry out fine-tuning, and what are the GPU requirements? Thanks.
  • ajinkya123-robo
  • 1
  • Opened on Apr 8, 2024
  • #3

What is the minimum number of V100 or A800 GPUs needed to train this model, if training time is not a concern?
  • ybdesire
  • Opened on Apr 5, 2024
  • #1