USB FPGA #3

Open
snoby opened this issue Jul 13, 2022 · 7 comments

Comments

@snoby

snoby commented Jul 13, 2022

Everything works great with the exporter, except I found one corner case. On a mining rig that has some GPUs (AMD, in my case) plus one of those USB FPGA cards (the C1100), the initialization gets all messed up, because there is no entry for the FPGA in /run/hive/gpu-detect.json but there is a miner entry for it (teamredminer).

I've put in some debug logging and some try/except blocks, but I'm not sure how to work around this.

In class Miner:

log.debug(' return is: %s ',self.stats['bus_numbers'][0])
prints 0, not None, maybe because it's initialized based on the AMD cards that are already there...

2022-07-13 10:06:01,995 - DEBUG - Reading HiveOS configuration from /hive-config/rig.conf
2022-07-13 10:06:01,996 - INFO - Starting HTTP server on port 10101
2022-07-13 10:06:01,998 - DEBUG - Reading GPU details from /run/hive/gpu-detect.json
2022-07-13 10:06:01,998 - DEBUG - setting up GPU {'busid': '03:00.0', 'name': 'Radeon RX 6600 XT', 'brand': 'amd', 'subvendor': 'XFX', 'vbios': '113-123XT145W201222', 'mem': '8176 MB', 'mem_type': 'Samsung GDDR6'}
2022-07-13 10:06:01,998 - DEBUG - setting up GPU {'busid': '07:00.0', 'name': 'Radeon RX 6600 XT', 'brand': 'amd', 'subvendor': 'ASUS', 'vbios': '115-D532BP0-100', 'mem': '8176 MB', 'mem_type': 'Samsung GDDR6'}
2022-07-13 10:06:01,999 - DEBUG - setting up GPU {'busid': '0a:00.0', 'name': 'Radeon RX 6600 XT', 'brand': 'amd', 'subvendor': 'PowerColor', 'vbios': '113-D532XT-D05', 'mem': '8176 MB', 'mem_type': 'Samsung GDDR6'}
2022-07-13 10:06:01,999 - DEBUG - setting up GPU {'busid': '0d:00.0', 'name': 'Radeon RX 6800', 'brand': 'amd', 'subvendor': 'PowerColor', 'vbios': '111', 'mem': '16368 MB', 'mem_type': 'Samsung GDDR6'}
2022-07-13 10:06:01,999 - DEBUG - Reading statistics from /run/hive/last_stat.json
2022-07-13 10:06:01,999 - DEBUG - Adding Miner - teamredminer
2022-07-13 10:06:01,999 - DEBUG - self stats : name is: teamredminer
2022-07-13 10:06:01,999 - DEBUG -  stats {}
2022-07-13 10:06:01,999 - DEBUG -   return is: 0
2022-07-13 10:06:01,999 - DEBUG - Miner is a gpu miner:
Traceback (most recent call last):
  File "hiveos-exporter.py", line 193, in main
    cur_gpu = gpu_by_bus_num[bus_number]
KeyError: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "hiveos-exporter.py", line 232, in <module>
    main()
  File "hiveos-exporter.py", line 198, in main
    print(exception)
NameError: name 'exception' is not defined

I think the call to Miner.is_gpu_miner() might have to change, but I'm not sure how. Maybe it could cross-reference the busid against the detected GPUs.

I'm not sure how I would fix this. Any suggestions?
json_files.zip
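
A minimal sketch of how the two exceptions in the traceback above can arise, and of a guarded lookup. Only `gpu_by_bus_num` and the `busid` values come from the log; `bus_number_of`, `lookup_gpu`, and the idea that the dictionary is keyed by the hexadecimal PCI bus number are assumptions made for illustration, not the exporter's actual code.

```python
# GPU entries trimmed from the debug log above.
gpu_detect = [
    {"busid": "03:00.0", "name": "Radeon RX 6600 XT", "brand": "amd"},
    {"busid": "07:00.0", "name": "Radeon RX 6600 XT", "brand": "amd"},
    {"busid": "0a:00.0", "name": "Radeon RX 6600 XT", "brand": "amd"},
    {"busid": "0d:00.0", "name": "Radeon RX 6800", "brand": "amd"},
]


def bus_number_of(busid):
    """Convert a PCI address such as '0a:00.0' to its bus number (assumed hex)."""
    return int(busid.split(":")[0], 16)


# Assumption: the exporter keys detected GPUs by integer bus number.
gpu_by_bus_num = {bus_number_of(gpu["busid"]): gpu for gpu in gpu_detect}


def lookup_gpu(bus_number):
    """Return the detected GPU for a bus number, or None if there is none."""
    try:
        return gpu_by_bus_num[bus_number]
    except KeyError as exc:
        # Binding the exception with `as exc` avoids the secondary NameError
        # that appears when an unbound name is printed in the handler.
        print(f"no detected GPU for bus number {exc}")
        return None
```

With these inputs, bus number 0 (the value reported for the FPGA) has no matching entry, so `lookup_gpu(0)` returns `None` instead of raising.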

@snoby
Author

snoby commented Jul 13, 2022

I hacked it to at least start. The bus id that gets returned is 0, so I have the system key off of that: it returns False if the bus id is 0, which skips that miner in the main loop. I will put together a pull request to show it, but my Python skills are nowhere near as good as yours.
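
A rough sketch of that kind of workaround, written as a variation: instead of hard-coding bus id 0, it checks each reported bus number against the set of detected GPUs. The function name `has_only_known_gpus` and the second miner entry are hypothetical; this is not the code from the pull request.

```python
def has_only_known_gpus(bus_numbers, known_gpu_buses):
    """True only if the miner reports bus numbers and all belong to detected GPUs."""
    return bool(bus_numbers) and all(b in known_gpu_buses for b in bus_numbers)


# Bus numbers of the four cards in the debug log (03, 07, 0a, 0d in hex).
known_gpu_buses = {3, 7, 10, 13}

miners = {
    "teamredminer": [0],                  # the FPGA, as reported in this issue
    "example-gpu-miner": [3, 7, 10, 13],  # hypothetical second miner
}

# Miners that the main loop would skip rather than crash on.
skipped = [
    name
    for name, buses in miners.items()
    if not has_only_known_gpus(buses, known_gpu_buses)
]
```

Checking membership rather than a literal 0 also covers a rig where a real GPU happens to sit on bus 0, at the cost of dropping the FPGA's hashrate from the metrics.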

@heaje
Owner

heaje commented Jul 14, 2022

@snoby - thanks for the JSON files. I’ll take a look at this either tonight or tomorrow. I don’t have an FPGA card, so this is a case I hadn’t written code for yet.

@heaje
Owner

heaje commented Jul 14, 2022

@snoby - I'm digging around in the various HiveOS bash scripts to figure out how an FPGA is reported to Hive. A few requests for you:

  • Is the FPGA reported in the HiveOS UI? If so, can you provide a screenshot of it as listed with the rest of the GPUs in the rig? I'm curious to see how the device address is reported with it being a USB device.
  • If the FPGA is NOT reported in the HiveOS UI, how are you configuring teamredminer in your flight sheet to use the FPGA?
  • Please provide the output of the command usb-devices

I have yet to find anything that looks FPGA specific in the various HiveOS detection scripts that look for hardware. Everything I'm finding is GPU specific. What I'm aiming for is some way to properly detect the FPGA from the monitoring script and provide all the labels that are also provided for the GPUs.

@snoby
Author

snoby commented Jul 14, 2022

It's not listed nor detected in Hive. Only teamredminer sees it. It is a USB device that teamredminer knows about. What's interesting is that HiveOS is more than happy to let teamredminer start and run and report hashrate.

image

image

Output of usb-devices

root@AMD:~# usb-devices

T:  Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=480 MxCh= 2
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1d6b ProdID=0002 Rev=05.10
S:  Manufacturer=Linux 5.10.0-hiveos ehci_hcd
S:  Product=EHCI Host Controller
S:  SerialNumber=0000:00:1a.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub

T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 4
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=0024 Rev=00.00
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub

T:  Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  3 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=174c ProdID=1153 Rev=00.01
S:  Manufacturer=ASMedia
S:  Product=AS2115
S:  SerialNumber=00000000000000000000
C:  #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=0403 ProdID=6011 Rev=08.00
S:  Manufacturer=Xilinx
S:  Product=A-U55N
S:  SerialNumber=XFL1XB5NUZGS
C:  #Ifs= 4 Cfg#= 1 Atr=80 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=usbfs
I:  If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=ftdi_sio
I:  If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=ftdi_sio
I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=ftdi_sio

T:  Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=480 MxCh= 2
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1d6b ProdID=0002 Rev=05.10
S:  Manufacturer=Linux 5.10.0-hiveos ehci_hcd
S:  Product=EHCI Host Controller
S:  SerialNumber=0000:00:1d.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub

T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 6
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=0024 Rev=00.00
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
root@AMD:~#

It's an odd / complete corner case.

This is the card:
https://www.xilinx.com/products/accelerators/varium/c1100.html

Teamredminer config info for the fpga.
https://github.com/todxx/teamredminer/blob/master/doc/FPGA_GUIDE.txt

I will add that the 0 reported as the bus id seems to be a default, since the USB bus id in the output above is actually "01":

T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=0403 ProdID=6011 Rev=08.00
S:  Manufacturer=Xilinx
S:  Product=A-U55N
S:  SerialNumber=XFL1XB5NUZGS
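
One possible way to find the card from a monitoring script is to parse the `usb-devices` output shown above. The sketch below is only an illustration under assumptions: `parse_usb_devices` is a hypothetical helper, the sample is trimmed from the output in this comment, and matching on the `Manufacturer` string is a guess at a usable signal (the vendor ID 0403 belongs to the FTDI bridge chip, so it is not specific to the FPGA).

```python
import re

# Two device blocks trimmed from the usb-devices output in this comment.
SAMPLE = """\
T:  Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  3 Spd=480 MxCh= 0
P:  Vendor=174c ProdID=1153 Rev=00.01
S:  Manufacturer=ASMedia
S:  Product=AS2115

T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#=  4 Spd=480 MxCh= 0
P:  Vendor=0403 ProdID=6011 Rev=08.00
S:  Manufacturer=Xilinx
S:  Product=A-U55N
S:  SerialNumber=XFL1XB5NUZGS
"""


def parse_usb_devices(text):
    """Split usb-devices output into one dict per blank-line-separated block."""
    devices = []
    for block in re.split(r"\n\s*\n", text.strip()):
        dev = {}
        for line in block.splitlines():
            tag, _, rest = line.partition(":")
            tag = tag.strip()
            if tag == "T":  # topology line: bus and device number
                bus = re.search(r"Bus=\s*(\d+)", rest)
                num = re.search(r"Dev#=\s*(\d+)", rest)
                if bus and num:
                    dev["Bus"] = int(bus.group(1))
                    dev["Dev"] = int(num.group(1))
            elif tag == "P":  # product line: vendor and product IDs
                ids = re.search(r"Vendor=(\w+)\s+ProdID=(\w+)", rest)
                if ids:
                    dev["Vendor"], dev["ProdID"] = ids.groups()
            elif tag == "S":  # string descriptors such as Manufacturer=...
                key, _, value = rest.strip().partition("=")
                dev[key] = value
        if dev:
            devices.append(dev)
    return devices


devices = parse_usb_devices(SAMPLE)
fpgas = [d for d in devices if d.get("Manufacturer") == "Xilinx"]
```

On a live rig the text could come from `subprocess.run(["usb-devices"], capture_output=True, text=True).stdout` instead of the embedded sample.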

@heaje
Owner

heaje commented Jul 14, 2022

@snoby - I put in a minimal change that is likely similar to what you mentioned you had tried previously. In this case, the script still reports hashrate for the FPGA, but many of the labels are filled with "unknown". The card number is also just reported as the bus number from the last_stat.json (which in this case is "0").

I don't know for sure that this is the correct approach, but can you test out #4 for me and validate that it works for you? If it does, I'll get it merged, and then you can at least get the monitoring you need without crashes.

As for proper FPGA support, I likely would need to have that type of hardware to really dive into how to detect it. I don't foresee getting that kind of hardware any time soon.
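
The fallback described in this comment could look roughly like the sketch below. It is not the code from #4: `labels_for` and the `card` label are hypothetical names, and the label keys are simply the fields that appear in the gpu-detect.json entries logged earlier in the thread.

```python
UNKNOWN = "unknown"

# Fields seen in the gpu-detect.json entries from the debug log.
LABEL_KEYS = ("name", "brand", "subvendor", "vbios", "mem", "mem_type")


def labels_for(bus_number, gpu_by_bus_num):
    """Build metric labels, using 'unknown' when no GPU matches the bus number."""
    gpu = gpu_by_bus_num.get(bus_number)  # None for the FPGA
    labels = {
        key: (gpu.get(key, UNKNOWN) if gpu else UNKNOWN) for key in LABEL_KEYS
    }
    # Assumption: the card number falls back to the bus number from last_stat.json.
    labels["card"] = str(bus_number)
    return labels


gpu_by_bus_num = {
    3: {
        "busid": "03:00.0",
        "name": "Radeon RX 6600 XT",
        "brand": "amd",
        "subvendor": "XFX",
        "vbios": "113-123XT145W201222",
        "mem": "8176 MB",
        "mem_type": "Samsung GDDR6",
    },
}

fpga_labels = labels_for(0, gpu_by_bus_num)
gpu_labels = labels_for(3, gpu_by_bus_num)
```

Using `dict.get` keeps the hashrate series alive for the unmatched device while making it obvious in the labels that its hardware details could not be resolved.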

@heaje
Owner

heaje commented Jul 14, 2022

I've been digging further and could use one last bit of information from you. On your rig with the FPGA, can you give me the output of echo '{"command":"summary+devs"}' | nc -w 10 localhost 65078 | jq? The last_stat.json output you gave me had zeroes for the hashrate. I'm curious to see what teamredminer itself reports via its API.

I'm suspicious that HiveOS doesn't even properly pull out the stats from the miner for an FPGA device.
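
For reference, the same query could be made from Python without `nc`. This is a sketch under assumptions: the helper names are hypothetical, the port 65078 is taken from the command above, the miner is assumed to close the connection after replying, and trailing NUL bytes are stripped defensively because some cgminer-style APIs terminate replies that way.

```python
import json
import socket


def encode_request(command):
    """Encode a command the same way `echo '{"command": ...}'` would send it."""
    return (json.dumps({"command": command}) + "\n").encode()


def decode_reply(raw):
    """Decode a raw reply, dropping trailing NUL bytes and whitespace."""
    return json.loads(raw.rstrip(b"\x00 \r\n").decode())


def query_miner_api(command, host="localhost", port=65078, timeout=10.0):
    """Send one command and read until the miner closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_request(command))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read means the peer closed the socket
                break
            chunks.append(data)
    return decode_reply(b"".join(chunks))


# On a rig with the miner running:
#     reply = query_miner_api("summary+devs")
```

If the miner keeps the socket open instead of closing it, `recv` raises a timeout after `timeout` seconds, which mirrors the `-w 10` behaviour of the `nc` command only loosely.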

@snoby
Author

snoby commented Jul 14, 2022

root@AMD:~# echo '{"command":"summary+devs"}' | nc -w 10 localhost 65078 | jq
{
  "summary": {
    "STATUS": [
      {
        "STATUS": "S",
        "When": 1657792952,
        "Code": 11,
        "Msg": "Summary",
        "Description": "TeamRedMiner 0.10.2"
      }
    ],
    "SUMMARY": [
      {
        "Elapsed": 71233,
        "MHS av": 70.59,
        "MHS 30s": 70.63,
        "KHS av": 70590,
        "KHS 30s": 70630,
        "Found Blocks": 0,
        "Getworks": 25774,
        "Accepted": 1163,
        "Rejected": 0,
        "Hardware Errors": 0,
        "Utility": 0.9796,
        "Discarded": 0,
        "Stale": 2,
        "Get Failures": 0,
        "Local Work": 0,
        "Remote Failures": 0,
        "Network Blocks": 25561,
        "Total MH": 5028024042527,
        "Work Utility": 0.9123,
        "Difficulty Accepted": 1083.11162715,
        "Difficulty Rejected": 0,
        "Difficulty Stale": 0,
        "Best Share": 0,
        "Device Hardware%": 0,
        "Device Rejected%": 0,
        "Pool Rejected%": 0,
        "Pool Stale%": 0,
        "Last getwork": 0
      }
    ],
    "id": 1
  },
  "devs": {
    "STATUS": [
      {
        "STATUS": "S",
        "When": 1657792952,
        "Code": 9,
        "Msg": "0 GPU(s) - 1 PGA(s)",
        "Description": "TeamRedMiner 0.10.2"
      }
    ],
    "DEVS": [
      {
        "PGA": 0,
        "Name": "C1100",
        "ID": 0,
        "Enabled": "Y",
        "Status": "Alive",
        "Temperature": 63.09,
        "MHS av": 70.59,
        "MHS 30s": 70.63,
        "KHS av": 70590,
        "KHS 30s": 70630,
        "Accepted": 1163,
        "Rejected": 0,
        "Hardware Errors": 0,
        "Utility": 0.9796,
        "Last Share Pool": 1,
        "Last Share Time": 0,
        "Total MH": 5028024042527,
        "Frequency": 578,
        "Diff1 Work": 1121.908951,
        "Difficulty Accepted": 1083.11162715,
        "Difficulty Rejected": 0,
        "Last Share Difficulty": 0,
        "Last Valid Work": 0,
        "Device Hardware%": 0,
        "Device Rejected%": 0,
        "Device Elapsed": 71233,
        "Memory Temperature": 80,
        "Fan Speed": 0,
        "Fan Percent": 0,
        "Memory Clock": 1250,
        "Core Voltage": 0.749,
        "BRAM Voltage": 0.854,
        "Memory Voltage": 1.197,
        "FPGA Power": 80.894976,
        "FPGA Activity": 0
      }
    ],
    "id": 1
  },
  "id": 1
}
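
The reply above lists the FPGA under a `PGA` key in `DEVS`, so a monitoring script could pick it out directly from the miner API. The sketch below is one way to do that; `pga_metrics` and the output field names are hypothetical, the sample is trimmed from the reply above, and no units are assumed beyond what the key names suggest.

```python
# Trimmed copy of the summary+devs reply shown above.
reply = {
    "devs": {
        "DEVS": [
            {
                "PGA": 0,
                "Name": "C1100",
                "Status": "Alive",
                "Temperature": 63.09,
                "MHS 30s": 70.63,
                "Accepted": 1163,
                "Rejected": 0,
                "FPGA Power": 80.894976,
            }
        ]
    }
}


def pga_metrics(reply):
    """Collect per-device values for every DEVS entry that carries a PGA index."""
    rows = []
    for dev in reply.get("devs", {}).get("DEVS", []):
        if "PGA" not in dev:  # skip entries that are not PGA devices
            continue
        rows.append(
            {
                "index": dev["PGA"],
                "name": dev.get("Name", "unknown"),
                "hashrate_mhs_30s": dev.get("MHS 30s"),
                "temperature": dev.get("Temperature"),
                "power": dev.get("FPGA Power"),
                "accepted": dev.get("Accepted"),
                "rejected": dev.get("Rejected"),
            }
        )
    return rows


rows = pga_metrics(reply)
```

Reading these values from the miner itself sidesteps last_stat.json, which reported zero hashrate for this device.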
