[Bug] 2.0 RC2 Start Harvester Fails #15910

Closed
BrandtH22 opened this issue Aug 1, 2023 · 3 comments
Labels: bug (Something isn't working), compression (Related to compressed plotting/farming)

@BrandtH22
Contributor

BrandtH22 commented Aug 1, 2023

What happened?

Bug reported by Daryl in Discord (https://discord.com/channels/1034523881404370984/1099818017908592660/1135638423118549083)

Updated from the first 2.0 alpha release to the 2.0 rc2 release, but it keeps throwing an error when I try to start the harvester.

[Screenshot attached in the original issue]

Attempted resolutions:

  • Deleted config and started fresh
  • Enabled GPU decompression
  • Set parallel decompressor threads to 1
  • Threads were at 8; tried to bump them down to 1, no such luck
  • Updated and upgraded all my packages, then restarted to make sure nothing was hanging on to the GPU. The harvester still won't start
  • Only way to get around the error was to rename it back to decompresser (from decompressor), but then GPU farming wasn't working

More info:
Nvidia Driver: 530.30.02
CUDA Version: 12.1
Ubuntu desktop, 40 cores with 256GB of RAM, RTX a4000

Steps to reproduce:

  • Close the GUI
  • Run chia stop all -d
  • Back up the config file to the desktop
  • Delete the config file
  • Run sudo dpkg --install chia-blockchain_2.0.0rc2_amd64.deb
  • Run chia init
  • Set parallel_decompressor_count to 1
  • Add plot directories
  • Switch use_gpu_harvesting to true
  • Run chia start harvester
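
The two config edits in the steps above can be sketched in Python as follows. This is only an illustration on an already-parsed config dictionary: the function name `apply_repro_settings` and the tiny stand-in input are assumptions, and loading or saving the real config.yaml (for example with a YAML library) is deliberately left out.

```python
import copy


def apply_repro_settings(config: dict) -> dict:
    """Return a copy of a parsed config with the two harvester edits applied.

    Hypothetical helper: it only mirrors the two settings named in the
    reproduction steps and does not touch any file on disk.
    """
    updated = copy.deepcopy(config)
    harvester = updated.setdefault("harvester", {})
    harvester["parallel_decompressor_count"] = 1  # "set parallel_decompressor_count to 1"
    harvester["use_gpu_harvesting"] = True        # "switch use_gpu_harvesting to true"
    return updated


# Stand-in for a freshly generated config (illustrative values only).
fresh = {"harvester": {"parallel_decompressor_count": 0, "use_gpu_harvesting": False}}
repro = apply_repro_settings(fresh)
print(repro["harvester"])
```

Working on a copy keeps the original dictionary untouched, so the "before" and "after" settings can be compared when reporting the bug.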

Config file harvester section:

harvester:
  chia_ssl_ca:
    crt: config/ssl/ca/chia_ca.crt
    key: config/ssl/ca/chia_ca.key
  decompressor_thread_count: 0
  disable_cpu_affinity: false
  enforce_gpu_index: false
  farmer_peer:
    host: localhost
    port: 8447
  gpu_index: 0
  logging: *id001
  max_compression_level_allowed: 7
  network_overrides: *id002
  num_threads: 30
  parallel_decompressor_count: 1
  parallel_read: true
  plot_directories:
  - /mnt/j301
  plots_refresh_parameter:
    batch_size: 300
    batch_sleep_milliseconds: 1
    interval_seconds: 120
    retry_invalid_seconds: 1200
  port: 8448
  private_ssl_ca:
    crt: config/ssl/ca/private_ca.crt
    key: config/ssl/ca/private_ca.key
  recursive_plot_scan: false
  rpc_port: 8560
  selected_network: mainnet
  ssl:
    private_crt: config/ssl/harvester/private_harvester.crt
    private_key: config/ssl/harvester/private_harvester.key
  start_rpc_server: true
  use_gpu_harvesting: true
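
Since the report mentions a "decompresser" versus "decompressor" rename, a quick check of the harvester section for the older spelling may help when comparing configs. The sketch below assumes the rename refers to config key names, which the report does not state outright; the key `parallel_decompresser_count` in the example is hypothetical, and the helper name is made up for illustration.

```python
from typing import List


def find_decompresser_keys(section: dict) -> List[str]:
    """List keys in a config section that use the 'decompresser' spelling."""
    return sorted(key for key in section if "decompresser" in key)


# Keys and values copied from the harvester section in this report.
reported = {
    "decompressor_thread_count": 0,
    "parallel_decompressor_count": 1,
    "use_gpu_harvesting": True,
}

# Hypothetical older-style section, used only to show a positive match.
older_style = {"parallel_decompresser_count": 1, "use_gpu_harvesting": True}

print(find_decompresser_keys(reported))
print(find_decompresser_keys(older_style))
```

The reported section uses only the "decompressor" spelling, so the first call finds nothing to flag.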

Version

2.0 RC2

What platform are you using?

Linux

What ui mode are you using?

CLI

Relevant log output

No response

BrandtH22 added the bug label Aug 1, 2023
emlowe added the compression label Aug 2, 2023
@jmhands

jmhands commented Aug 2, 2023

I reproduced on all 3 of my machines on Ubuntu server (22.04 & 23) with GPU harvesting enabled.

  File "chia/server/start_harvester.py", line 78, in <module>
  File "chia/server/start_harvester.py", line 74, in main
  File "chia/server/start_service.py", line 319, in async_run
  File "asyncio/runners.py", line 44, in run
  File "asyncio/base_events.py", line 616, in run_until_complete
  File "chia/server/start_harvester.py", line 66, in async_main
  File "chia/server/start_harvester.py", line 37, in create_harvester_service
  File "chia/harvester/harvester.py", line 108, in __init__
  File "chia/plotting/manager.py", line 106, in configure_decompressor
RuntimeError: Failed to preallocate memory for contexts with result 2
[3983716] Failed to execute script 'start_harvester' due to unhandled exception!
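
To spot this failure in collected logs, the error message can be matched with a small regular expression. This is a rough sketch: the helper name `prealloc_result_code` is an assumption, and the thread does not say what result code 2 means, so the code only extracts the number.

```python
import re
from typing import Optional

# Matches the RuntimeError text shown in the traceback above.
PREALLOC_ERROR = re.compile(
    r"Failed to preallocate memory for contexts with result (\d+)"
)


def prealloc_result_code(log_text: str) -> Optional[int]:
    """Return the numeric result code from the error line, or None if absent."""
    match = PREALLOC_ERROR.search(log_text)
    return int(match.group(1)) if match else None


sample = "RuntimeError: Failed to preallocate memory for contexts with result 2"
code = prealloc_result_code(sample)
print(code)
```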

Workaround: reinstall the NVIDIA driver and reboot the system. After that, with no change to config.yaml, the harvester was able to start.
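
After a driver reinstall and reboot, it can be useful to confirm that the driver responds before starting the harvester again. The sketch below queries `nvidia-smi` for the driver version and GPU name. The helper names are assumptions, the sample output line is illustrative rather than taken from this thread, and a successful query does not by itself show that the harvester will start.

```python
import shutil
import subprocess
from typing import Optional, Tuple


def parse_gpu_line(line: str) -> Tuple[str, str]:
    """Split one 'driver_version, name' CSV line into its two fields."""
    driver, name = (part.strip() for part in line.split(",", 1))
    return driver, name


def query_gpu() -> Optional[Tuple[str, str]]:
    """Return (driver_version, gpu_name) for the first GPU, or None on failure."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0 or not result.stdout.strip():
        return None  # driver not responding or no GPU listed
    return parse_gpu_line(result.stdout.splitlines()[0])


# Illustrative output line; the exact name string on a given machine may differ.
example = parse_gpu_line("530.30.02, NVIDIA RTX A4000")
print(example)
```

Calling `query_gpu()` on the affected machine returns `None` when `nvidia-smi` is missing or fails, which is a cheap signal that the driver itself needs attention first.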

@wallentx
Contributor

wallentx commented Aug 7, 2023

We should try this with rc3, since we've added all architectures

@jmhands

jmhands commented Aug 21, 2023

This has been fixed now, as of rc6.

@jmhands jmhands closed this as completed Aug 21, 2023