
Issue search results · repo:AI-Hypercomputer/jetstream-pytorch language:Python

14 results

Currently, sampling params such as temperature are set as command-line flags when the server starts. It would be nice for each request to pass in its sampling params instead.
  • qihqi
  • 3
  • Opened on Sep 24, 2024
  • #185

Recently we added a new CLI, jpt (https://github.com/google/jetstream-pytorch/pull/178), that massively simplified the command-line args the user needs to specify. However, there are other command-line args ...
  • qihqi
  • Opened on Sep 10, 2024
  • #182

As reported by @tengomucho, there are currently a few issues with the prefill / generate implementation: 1. Prefill does not use self._sample to do sampling. 2. Prefill returns a token, so the first time generate ...
  • qihqi
  • Opened on Aug 21, 2024
  • #173

I'm receiving an error when attempting to run: ray job submit -- python run_ray_serve_interleave.py --tpu_chips=4 --num_hosts=1 --size=8B --model_name=llama-3 --batch_size=8 --max_cache_length=2048 --tokenizer_path=$tokenizer_path ...
  • ryanaoleary
  • Opened on Aug 7, 2024
  • #169

Prefill_ray() now returns a [result, first_token] tuple, where first_token contains a Jax array. This will cause a crash when attempting to fetch the Ray results remotely: job_id:06000000 :actor_name:ServeReplica:default:JetStreamDeployment ...
  • richardsliu
  • 1
  • Opened on Jul 16, 2024
  • #150

When sending multiple prompts to the server, only the first prompt is able to return any results. Requests after the first one return only an empty response. I've tried 3 different ways to bring up the ...
  • richardsliu
  • 2
  • Opened on Jun 27, 2024
  • #137

The checkpoint conversion script breaks for https://huggingface.co/meta-llama/Llama-2-7b, because it does not have safetensor files. But when running the script, we set --from_hf=True since the checkpoint ...
  • vivianrwu
  • Opened on Jun 24, 2024
  • #135

I get this error: Loading checkpoint files from /home/yeandy/llama/llama-2-13b. Loading checkpoints takes 9.128946957000153 seconds Starting to merge weights. Merging weights across 2 shards (shape = torch.Size([32000, ...
  • yeandy
  • 2
  • Opened on Jun 4, 2024
  • #115

Right now, the ray engine returns the interleave engine and a tuple separately. Eventually, we would like to return a stable Tuple list for both of them.
  • FanhaiLu1
  • Opened on May 29, 2024
  • #107