Description
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:04:19,355 - __main__ - INFO - Attempt 47: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:04:20,411 - __main__ - INFO - Attempt 48: Unexpected status code 502
I get the output above when I run:
python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/horribleocr.pdf
Full log:
(olmocr) aiml@aiml35:~/olmocr$ python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/horribleocr.pdf
INFO:olmocr.check:pdftoppm is installed and working.
2025-03-19 17:03:29,762 - __main__ - INFO - Got --pdfs argument, going to add to the work queue
2025-03-19 17:03:29,762 - __main__ - INFO - Loading file at tests/gnarly_pdfs/horribleocr.pdf as PDF document
2025-03-19 17:03:29,762 - __main__ - INFO - Found 1 total pdf paths to add
Sampling PDFs to calculate optimal length: 100%|██████████| 1/1 [00:00<00:00, 492.06it/s]
2025-03-19 17:03:29,765 - __main__ - INFO - Calculated items_per_group: 500 based on average pages per PDF: 1.00
INFO:olmocr.work_queue:Found 1 total paths
INFO:olmocr.work_queue:0 new paths to add to the workspace
2025-03-19 17:03:29,872 - __main__ - INFO - Starting pipeline with PID 373833
2025-03-19 17:03:29,872 - __main__ - INFO - Downloading model 'allenai/olmOCR-7B-0225-preview'
Fetching 15 files: 100%|██████████| 15/15 [00:00<00:00, 88737.04it/s]
2025-03-19 17:03:30,318 - __main__ - INFO - Model download complete 'allenai/olmOCR-7B-0225-preview'
INFO:olmocr.work_queue:Initialized local queue with 1 work items
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:30,415 - __main__ - INFO - Attempt 1: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:31,453 - __main__ - INFO - Attempt 2: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:32,511 - __main__ - INFO - Attempt 3: Unexpected status code 502
2025-03-19 17:03:32,825 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] Detected different devices in the system:
2025-03-19 17:03:32,825 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:32,825 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:32,826 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] NVIDIA RTX A5000
2025-03-19 17:03:32,826 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:32,826 - __main__ - INFO - WARNING 03-19 17:03:32 cuda.py:81] Please make sure to set CUDA_DEVICE_ORDER=PCI_BUS_ID to avoid unexpected behavior.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:33,571 - __main__ - INFO - Attempt 4: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:34,645 - __main__ - INFO - Attempt 5: Unexpected status code 502
2025-03-19 17:03:34,858 - __main__ - INFO - [2025-03-19 17:03:34] server_args=ServerArgs(model_path='allenai/olmOCR-7B-0225-preview', tokenizer_path='allenai/olmOCR-7B-0225-preview', tokenizer_mode='auto', load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, quantization=None, context_length=None, device='cuda', served_model_name='allenai/olmOCR-7B-0225-preview', chat_template='qwen2-vl', is_embedding=False, revision=None, skip_tokenizer_init=False, host='127.0.0.1', port=30024, mem_fraction_static=0.8, max_running_requests=None, max_total_tokens=None, chunked_prefill_size=2048, max_prefill_tokens=16384, schedule_policy='lpm', schedule_conservativeness=1.0, cpu_offload_gb=0, prefill_only_one_req=False, tp_size=1, stream_interval=1, stream_output=False, random_seed=662447808, constrained_json_whitespace_pattern=None, watchdog_timeout=300, download_dir=None, base_gpu_id=0, log_level='info', log_level_http='warning', log_requests=False, show_time_cost=False, enable_metrics=False, decode_log_interval=40, api_key=None, file_storage_pth='sglang_storage', enable_cache_report=False, dp_size=1, load_balance_method='round_robin', ep_size=1, dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', lora_paths=None, max_loras_per_batch=8, attention_backend='flashinfer', sampling_backend='flashinfer', grammar_backend='outlines', speculative_draft_model_path=None, speculative_algorithm=None, speculative_num_steps=5, speculative_num_draft_tokens=64, speculative_eagle_topk=8, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, disable_jump_forward=False, disable_cuda_graph=False, disable_cuda_graph_padding=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, disable_mla=False, disable_overlap_schedule=False, enable_mixed_chunk=False, enable_dp_attention=False, enable_ep_moe=False, enable_torch_compile=False, torch_compile_max_bs=32, cuda_graph_max_bs=8, cuda_graph_bs=None, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, tool_call_parser=None)
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:35,717 - __main__ - INFO - Attempt 6: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:36,789 - __main__ - INFO - Attempt 7: Unexpected status code 502
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] Detected different devices in the system:
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A5000
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,513 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] Please make sure to set CUDA_DEVICE_ORDER=PCI_BUS_ID to avoid unexpected behavior.
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] Detected different devices in the system:
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A5000
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] NVIDIA RTX A6000
2025-03-19 17:03:37,517 - __main__ - INFO - WARNING 03-19 17:03:37 cuda.py:81] Please make sure to set CUDA_DEVICE_ORDER=PCI_BUS_ID to avoid unexpected behavior.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:37,826 - __main__ - INFO - Attempt 8: Unexpected status code 502
2025-03-19 17:03:38,077 - __main__ - INFO - [2025-03-19 17:03:38] Use chat template for the OpenAI-compatible API server: qwen2-vl
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:38,866 - __main__ - INFO - Attempt 9: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:39,902 - __main__ - INFO - Attempt 10: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:40,962 - __main__ - INFO - Attempt 11: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:42,034 - __main__ - INFO - Attempt 12: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:43,105 - __main__ - INFO - Attempt 13: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:44,176 - __main__ - INFO - Attempt 14: Unexpected status code 502
2025-03-19 17:03:44,656 - __main__ - INFO - [2025-03-19 17:03:44 TP0] Overlap scheduler is disabled for multimodal models.
2025-03-19 17:03:45,245 - __main__ - INFO - [2025-03-19 17:03:45 TP0] Automatically reduce --mem-fraction-static to 0.760 because this is a multimodal model.
2025-03-19 17:03:45,245 - __main__ - INFO - [2025-03-19 17:03:45 TP0] Automatically turn off --chunked-prefill-size and disable radix cache for qwen2-vl.
2025-03-19 17:03:45,245 - __main__ - INFO - [2025-03-19 17:03:45 TP0] Init torch distributed begin.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:45,247 - __main__ - INFO - Attempt 15: Unexpected status code 502
2025-03-19 17:03:45,410 - __main__ - INFO - [2025-03-19 17:03:45 TP0] Load weight begin. avail mem=47.27 GB
2025-03-19 17:03:46,316 - __main__ - INFO - [2025-03-19 17:03:46 TP0] Using model weights format ['*.safetensors']
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:46,320 - __main__ - INFO - Attempt 16: Unexpected status code 502
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:00<00:00, 3.89it/s]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:00<00:00, 5.45it/s]
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:47,399 - __main__ - INFO - Attempt 17: Unexpected status code 502
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:00<00:00, 4.68it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:00<00:00, 4.74it/s]
2025-03-19 17:03:47,656 - __main__ - INFO -
2025-03-19 17:03:47,662 - __main__ - INFO - [2025-03-19 17:03:47 TP0] Load weight end. type=Qwen2VLForConditionalGeneration, dtype=torch.bfloat16, avail mem=31.56 GB
2025-03-19 17:03:47,673 - __main__ - INFO - [2025-03-19 17:03:47 TP0] KV Cache is allocated. K size: 10.11 GB, V size: 10.11 GB.
2025-03-19 17:03:47,673 - __main__ - INFO - [2025-03-19 17:03:47 TP0] Memory pool end. avail mem=10.83 GB
2025-03-19 17:03:47,813 - __main__ - INFO - [2025-03-19 17:03:47 TP0] Capture cuda graph begin. This can take up to several minutes.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:48,469 - __main__ - INFO - Attempt 18: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:49,540 - __main__ - INFO - Attempt 19: Unexpected status code 502
100%|██████████| 4/4 [00:01<00:00, 2.29it/s]
2025-03-19 17:03:49,566 - __main__ - INFO - [2025-03-19 17:03:49 TP0] Capture cuda graph end. Time elapsed: 1.75 s
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:50,612 - __main__ - INFO - Attempt 20: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:51,682 - __main__ - INFO - Attempt 21: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:52,754 - __main__ - INFO - Attempt 22: Unexpected status code 502
2025-03-19 17:03:53,347 - __main__ - INFO - [2025-03-19 17:03:53 TP0] max_total_num_tokens=378594, chunked_prefill_size=-1, max_prefill_tokens=16384, max_running_requests=4097, context_len=32768
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:53,825 - __main__ - INFO - Attempt 23: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-03-19 17:03:54,896 - __main__ - INFO - Attempt 24: Unexpected status code 502
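
For anyone trying to reproduce this outside the pipeline: the repeating "Attempt N: Unexpected status code 502" lines look like a readiness poll against /v1/models, roughly once per second. Below is a minimal standalone sketch of such a poll, using only the standard library. It is not olmocr's actual code; the function names `wait_until_ready` and `http_status`, the attempt limit, and the delay are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET on url (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers such as 502 arrive as HTTPError; report the code.
        return err.code


def wait_until_ready(
    get_status: Callable[[], int],
    max_attempts: int = 300,      # assumed limit, not taken from the logs
    delay_s: float = 1.0,         # logs suggest roughly one poll per second
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll until get_status() returns 200.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            status = get_status()
        except OSError as exc:
            # Covers refused connections and timeouts (no HTTP answer at all).
            print(f"Attempt {attempt}: {exc}")
        else:
            if status == 200:
                return attempt
            print(f"Attempt {attempt}: Unexpected status code {status}")
        sleep(delay_s)
    return None
```

Against the server from the log this would be called as `wait_until_ready(lambda: http_status("http://localhost:30024/v1/models"))`. In this sketch a refused connection shows up as an exception message, while a 502 shows up as a status code, so the two cases can be told apart in the output.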