Cannot understand the meaning of the amplitude. #9

Closed
Guolin-Yin opened this issue Apr 19, 2021 · 12 comments

@Guolin-Yin

Thanks for your work, I am using your library to read CSI packets generated from Nexmon CSI, but I have some questions about the data.

  1. The channel state information (CSI) in 802.11 is calculated as H = Y/X, where H is the CSI. Its value should range from 0 to 1: 0 means the signal faded significantly and 1 means the signal did not fade in the channel. Due to the multipath channel, the value could be a little larger than 1. But the original data obtained by CSIKit are very large; the image below shows the CSI amplitude data in a linear format.
    image
  2. The data looks reasonable after setting scaled to True: csi_data = my_reader.read_file("DATASET-pi-2\Empty-21.pcap", scaled=True). Could you explain how the scaling works?

Thanks in advance!!

@Gi-z
Owner

Gi-z commented Apr 19, 2021

Hi Colin,

Thanks for your issue, I really appreciate any opportunity to see how people want to use this library.

  1. I can't claim to be an expert in the area, but I'll do my best to try and answer the first question! You're absolutely right about CSI being calculated as H = Y/X, however while theoretically perfect CSI as described in the spec will follow this 0-1 range, the CSI output by Nexmon CSI (and other commercial off-the-shelf CSI drivers/solutions) has been processed by other systems before it reaches Nexmon's UDP packets. Most notably, this includes automatic gain control (AGC), but Broadcom's internal scaling and other processing factors may also affect it. I see you made an issue on the nexmon_csi repo too, and I hope you'll receive a more definitive answer there!

  2. To get some context for the scaled keyword in Reader, it may be worth briefly familiarising yourself with the Linux 802.11n CSI Tool's FAQ page. This is the most well-established commercial off-the-shelf CSI collection solution which has some very useful documentation. The most relevant parts follow:

csi is the CSI itself, normalized to an internal reference. It is a Ntx×Nrx×30 3-D matrix where the third dimension is across 30 subcarriers in the OFDM channel. For a 20 MHz-wide channel, these correspond to about half the OFDM subcarriers, and for a 40 MHz-wide channel, this is about one in every 4 subcarriers. Which subcarriers were measured is defined by the IEEE 802.11n-2009 standard (in Table 7-25f on page 50).

Now that we've described all the fields of this struct, we need to put them all together to compute the CSI in absolute units, rather than Intel's internal reference level. In particular, we need to combine the RSSI and AGC values together to get RSS in dBm, and include noise to get SNR. If there is no noise, as in the sample case, we instead use a hard-coded noise floor of -92 dBm. We use the script get_scaled_csi.m to do this:

Fully reversing the scaling which occurs on the Intel IWL5300 is possible thanks to the calculations provided in the linked script. The Linux 802.11n CSI Tool data format includes the values used for the AGC in the Intel driver, while the script also includes some constants used by Intel within their code.
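
As a rough illustration of the "combine the RSSI and AGC values" step quoted above, here is a minimal Python sketch. It follows the structure of the CSI Tool's get_total_rss.m helper as I understand it (per-antenna RSSI values summed in linear units, antennas reporting 0 skipped, then a constant offset of 44 and the AGC value subtracted). The function names and the example numbers are my own and purely illustrative.

```python
import math


def dbinv(x_db: float) -> float:
    """Convert a power value in dB to linear units."""
    return 10 ** (x_db / 10)


def total_rss_dbm(rssi_a: int, rssi_b: int, rssi_c: int, agc: int) -> float:
    """Combine per-antenna RSSI and AGC into a single RSS figure in dBm.

    Assumption: mirrors get_total_rss.m from the Linux 802.11n CSI Tool.
    Antennas that report an RSSI of 0 are treated as unused.
    """
    rssi_mag = sum(dbinv(r) for r in (rssi_a, rssi_b, rssi_c) if r != 0)
    return 10 * math.log10(rssi_mag) - 44 - agc


# Hypothetical values for one received frame.
example_rss = total_rss_dbm(rssi_a=33, rssi_b=37, rssi_c=41, agc=38)
print(f"Total RSS: {example_rss:.2f} dBm")
```

The point is only that the Intel format carries the AGC value per frame, so it can be subtracted back out. Nothing equivalent is available in the standard nexmon_csi output.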

We don't have quite the same luxuries when it comes to Broadcom CSI data produced by nexmon_csi, so issues like that which you're having can occur.
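
To make the first point more concrete, here is a small synthetic sketch of why H = Y/X can come out far larger than 1 once the receiver has applied a gain we cannot see. Everything here is assumed for illustration: the pilot symbols, the channel values and the receiver_gain figure are made up and do not come from any real Broadcom device.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub = 64

# Known unit-magnitude pilot symbols X, one per subcarrier.
x = np.exp(1j * rng.uniform(0, 2 * np.pi, n_sub))

# A "true" channel whose amplitudes are mostly below 1.
h_true = 0.3 * (rng.normal(size=n_sub) + 1j * rng.normal(size=n_sub))

# Hypothetical gain applied inside the receiver (AGC plus internal scaling).
receiver_gain = 250.0

# What the chip works with is the amplified signal, so the estimate H = Y/X
# absorbs the gain.
y = receiver_gain * h_true * x
h_est = y / x

print("mean |H| true:     ", np.abs(h_true).mean())
print("mean |H| estimated:", np.abs(h_est).mean())
```

The shape across subcarriers is preserved, but the absolute scale is multiplied by whatever gain was in effect for that frame.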

The scaled option on the Reader class in CSIKit is passed to the specific subclass used for each different file format. For Intel, it'll perform the calculations described above. For Nexmon, a specific variant of the nexmon_csi driver (by @mzakharo, found here) can be used to extract RSSI values for each CSI frame. My scaling solution divides the CSI values by the RSSI for each frame in order to establish a more consistent scale. This won't remove the impact of AGC, however it does aim to lessen it in some ways.

Let me know if you have any further questions (or corrections!).

@Guolin-Yin
Author

Thanks for your answers, they completely answered my questions. I am using the Raspberry Pi to collect CSI for activity detection, so I need to make sure the CSI values are correct. My dataset was collected using this branch. As you mentioned:

For Nexmon, a specific variant of the nexmon_csi driver (by @mzakharo, found here) can be used to extract RSSI values for each CSI frame.
I am not sure whether I can use the scaled option on my dataset to perform correct scaling, since I'm not sure if the RSSI value can be extracted from the .pcap files of my dataset.

@Gi-z
Owner

Gi-z commented Apr 19, 2021

Unfortunately that branch is based on the core nexmon_csi codebase, and not the pull request I linked. The data you've recorded so far doesn't contain the RSSI stored for each frame. You would need to reinstall from that pull request (cloning that forked version of the repository) and recollect data. Hopefully Seemoo will consider merging that pull request in the future.

@Guolin-Yin
Author

@Gi-z Thanks for your help. I did some experiments, e.g., increasing the distance between two Raspberry Pis from 1 m to 10 m. The amplitude of the CSI stays almost the same due to the compensation of the AGC, and even when the distance is fixed the amplitude may vary a lot. As you mentioned,

My scaling solution divides the CSI values by the RSSI for each frame in order to establish a more consistent scale. This won't remove the impact of AGC, however, it does aim to lessen it in some ways.

I would like to keep the distance information between transmitter and receiver, and I hope the scaling won't introduce too much error into my data. Therefore, I wonder:

  • To what extent does this method reduce the impact of AGC?
  • Do you know of any method to shut down the AGC of the Raspberry Pi or fix the gain to a specific value?

Thanks in advance, your help would be valuable for me.

@Gi-z
Owner

Gi-z commented May 3, 2021

Hi Colin, I had a brief look into this.

image
This image shows a section taken from CRISLoc. It appears the scaling method I implemented is a half-step, and I need to adjust the formula I'm using to reverse the AGC. I'm going to try some proper experiments with this tomorrow and I'll let you know what I find. If this resolves the issue then I'll get it added to the scaled=True functionality.

@Guolin-Yin
Author

Hi Glenn. Thanks for sharing this wonderful paper. I am reading it now and it seems valuable to our research. Looking forward to hearing from you. Thanks a lot.

@Gi-z
Owner

Gi-z commented May 6, 2021

Hi Colin,

I've been looking into this quite a bit, definitely having trouble proving that the CRISLoc method does in fact reverse the AGC in any way. Their paper refers to Figure 5 as proof, which isn't something easily reproducible. They do have their data publicly available, however the CSI values aren't in their complex form and I'm having difficulty assessing what they've done with them in the first place. They do supposedly provide orig and scale samples, however I have similarly had trouble identifying the single coefficient used to scale corresponding frames. While I can implement the equation from the paper with ease, it won't do much good if I can't prove that it does in fact reverse the AGC.

Their key assumption relies on "...the sum of CSI squared over all the subcarriers should be consistent with RSS." This seems like the best place to start in proving this approach. I've had little success so far in actually proving these two metrics are in any way correlated, using data examples from Intel IWL5300 and the Pi 4 (BCM43455c0).

However, with the ESP32, I've been able to visualise some form of pattern. I captured CSI on the ESP32 with a packet generator at the other side of the room, and moved towards the generator and back 3 times.

image
^ Sum of the Squared CSI across all subcarriers.

image
^ RSS.

There does indeed seem to be a pattern here, and it's notable that while the sum of the squares of CSI usually remains in some way consistent, it does in fact spike when I move closer to the packet generator.
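
For anyone wanting to run the same kind of comparison on their own captures, the check can be sketched roughly as below. This is not the code used for the plots above. The data is synthetic and is constructed so that CSI power tracks RSS, purely to show the calculation, and the function names are my own.

```python
import numpy as np


def csi_power_db(csi_frames: np.ndarray) -> np.ndarray:
    """Sum of squared CSI magnitudes over subcarriers, in dB, one value per frame.

    csi_frames is expected to have shape (n_frames, n_subcarriers).
    """
    power = np.sum(np.abs(csi_frames) ** 2, axis=1)
    return 10 * np.log10(power)


def power_rss_correlation(csi_frames: np.ndarray, rss_dbm: np.ndarray) -> float:
    """Pearson correlation between per-frame CSI power (dB) and RSS (dBm)."""
    return float(np.corrcoef(csi_power_db(csi_frames), rss_dbm)[0, 1])


# Synthetic frames: random per-subcarrier fading scaled so power follows RSS.
rng = np.random.default_rng(1)
rss_dbm = np.linspace(-70, -40, 50)
fading = rng.normal(size=(50, 64)) + 1j * rng.normal(size=(50, 64))
csi_frames = fading * np.sqrt(10 ** (rss_dbm / 10))[:, None]

corr = power_rss_correlation(csi_frames, rss_dbm)
print(f"Correlation between CSI power and RSS: {corr:.3f}")
```

With real captures, csi_frames and rss_dbm would come from the parsed frames instead of the synthetic arrays.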

It may be worth moving this discussion to the nexmon_csi issues, as this is a far-reaching issue which affects most of us using off-the-shelf CSI. Notably, if this method for reversing the AGC is reliable enough to be used (even if it's less accurate than what would be possible with the AGC state/value), then this affects all commercial off-the-shelf CSI, not just that of nexmon-capable devices. For example, the CSI generated by the ESP32.

@Guolin-Yin
Author

..the sum of CSI squared over all the subcarriers should be consistent with RSS

My understanding of this assumption from the paper is that the sum of squares of the real CSI (before AGC) is consistent with RSS, but after the AGC's scaling, the sum of squared CSI should remain relatively stable when you move your Pi away from or closer to the transmitter.
The RSSI is just an average power over a packet, and the sum of the squared amplitudes can also represent the power of a packet. In OFDM, the power of a packet is distributed across subcarriers (not necessarily evenly, depending on the channel condition), hence the sum of the power of all the subcarriers (before AGC) should be the same as the RSSI. Therefore, if you use the CSI read directly from the packet for the test, it shouldn't be consistent with the RSSI.
That's my understanding, please correct me if anything is wrong.
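
A toy simulation of this reasoning, under a deliberately simplified AGC model: I assume the AGC rescales every frame to a fixed total power (target_power is a made-up set-point), which real hardware will only approximate. All names and numbers here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_sub = 5, 64

# RSS falling as the receiver moves away from the transmitter.
rss_dbm = np.array([-40.0, -50.0, -60.0, -70.0, -80.0])

# Random fading per subcarrier, normalised to unit total power per frame.
fading = rng.normal(size=(n_frames, n_sub)) + 1j * rng.normal(size=(n_frames, n_sub))
fading /= np.sqrt(np.sum(np.abs(fading) ** 2, axis=1, keepdims=True))

# "Real" CSI before AGC: total power per frame equals the RSS in linear units.
true_csi = fading * np.sqrt(10 ** (rss_dbm / 10))[:, None]

# Idealised AGC: scale each frame so its total power hits a fixed target.
target_power = 1.0  # hypothetical set-point
agc_gain = np.sqrt(target_power / np.sum(np.abs(true_csi) ** 2, axis=1))
reported_csi = true_csi * agc_gain[:, None]

true_power_db = 10 * np.log10(np.sum(np.abs(true_csi) ** 2, axis=1))
reported_power_db = 10 * np.log10(np.sum(np.abs(reported_csi) ** 2, axis=1))

print("RSS (dBm):              ", rss_dbm)
print("power before AGC (dB):  ", np.round(true_power_db, 2))
print("power after AGC (dB):   ", np.round(reported_power_db, 2))
```

Before the AGC the summed power follows the RSS; after it, the summed power is flat, which matches the amplitudes looking the same at 1 m and 10 m.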

@Gi-z
Owner

Gi-z commented May 7, 2021

Excellent, I've completely misread it. Makes much more sense that the assumption is based on the real CSI rather than the CSI after AGC. I'm still having trouble getting such behaviour from Intel CSI data (where we have the data before and after AGC reversal), which would be very useful for proving that this works.

However, applying the CRISLoc formula to the ESP32 data I posted above, the data does appear to be following a similar curve to the RSS. This is very good to see!
image
image
image

I'm hoping to see some more positive results today in trying to demonstrate that this underlying assumption is valid in an instance where we know the unscaled and scaled data. For instance, if we can estimate the AGC value which should be applied (given the RSS) then I'd feel more confident saying this is in some way reversing the AGC process.
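
One way this estimate could be sketched, assuming the same simplified model in which the reported CSI is the real CSI multiplied by a single per-frame gain: the gain implied by a frame is the difference between its summed power and the RSS. The function name and the 37 dB figure are hypothetical and only used to check the arithmetic on synthetic data.

```python
import numpy as np


def implied_gain_db(reported_csi: np.ndarray, rss_dbm: float) -> float:
    """Gain (in dB) that would explain the gap between reported CSI power and RSS."""
    reported_power_db = 10 * np.log10(np.sum(np.abs(reported_csi) ** 2))
    return float(reported_power_db - rss_dbm)


rng = np.random.default_rng(4)
rss_dbm = -62.0
known_gain_db = 37.0  # hypothetical gain applied by the receiver

# Synthetic "real" frame whose total power equals the RSS in linear units.
frame = rng.normal(size=64) + 1j * rng.normal(size=64)
frame *= np.sqrt(10 ** (rss_dbm / 10) / np.sum(np.abs(frame) ** 2))

# Apply the known gain (amplitude gain is the square root of the power gain).
reported = frame * np.sqrt(10 ** (known_gain_db / 10))

estimate = implied_gain_db(reported, rss_dbm)
print(f"Implied gain: {estimate:.2f} dB")
```

On Intel data, where the AGC value is recorded per frame, an estimate like this could be compared against the stored value.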

@reneprins4

Following the above thread I am trying to visualize the CSI (esp32 data) as you have shown in the visualizations. Could you share the code so I could complete the visualization?

@Gi-z
Owner

Gi-z commented May 17, 2021

Following the above thread I am trying to visualize the CSI (esp32 data) as you have shown in the visualizations. Could you share the code so I could complete the visualization?

Hi, yes, I'll provide both code and the data used for this example.
Data: https://drive.google.com/file/d/1MUuTdWIGMB03Vv1avKK6PBdga0bKS4A2/view?usp=sharing

from CSIKit.reader import CSVBeamformReader
from CSIKit.util.matlab import db, dbinv

import matplotlib.pyplot as plt
import numpy as np

def scale_csi_frame(csi: np.ndarray, rssi: int) -> np.ndarray:
    # Rescale one CSI frame so that its average power per subcarrier matches the RSSI.
    # This is not a true SNR ratio as is the case for the Intel scaling.
    # We do not have AGC or noise values, so it's just about establishing a scale against RSSI.

    # TODO: Find out whether it's RSS or RSSI being supplied with most CSI frames.
    # I thought RSSI wasn't negative?

    subcarrier_count = csi.shape[0]

    # Convert the RSSI from dB to linear power.
    rssi_pwr = dbinv(int(rssi))

    # Total power of the frame: sum of |csi|^2 over all entries.
    csi_sq = np.multiply(csi, np.conj(csi))
    csi_pwr = np.real(np.sum(csi_sq))

    # This implementation is based on the equation shown in https://doi.org/10.1109/JIOT.2020.3022573.
    # scaling_coefficient = sqrt(10^(RSS/10) / sum(CSIi^2))
    # Note: the line below divides by (csi_pwr / subcarrier_count) rather than the plain sum,
    # so the result differs from that equation by a constant factor of sqrt(subcarrier_count).

    scale = rssi_pwr / (csi_pwr / subcarrier_count)
    return csi * np.sqrt(scale)

reader = CSVBeamformReader()
csi_data = reader.read_file("rss_test.csv")

rssis = [x.rssi for x in csi_data.frames]
sub1_data_unscaled = [db(abs(x.csi_matrix[-1])) for x in csi_data.frames]
sub1_data_scaled = [db(abs(scale_csi_frame(x.csi_matrix, x.rssi)[-1])) for x in csi_data.frames]

plt.title("Sub1 Unscaled")
plt.plot(sub1_data_unscaled)
plt.show()

plt.title("RSS")
plt.plot(rssis)
plt.show()

plt.title("Sub1 Scaled")
plt.plot(sub1_data_scaled)
plt.show()
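
If you don't have CSIKit or the data file to hand, the core of the rescaling can be checked on a synthetic frame with NumPy alone. This sketch uses the equation exactly as written in the comment above (dividing by the plain sum of squared magnitudes), so after rescaling the total power of the frame equals the RSS. The function name and the -55 dBm value are made up for the example.

```python
import numpy as np


def rescale_to_rss(csi: np.ndarray, rss_dbm: float) -> np.ndarray:
    """Scale a CSI frame so that sum(|csi|^2) equals the RSS in linear units."""
    rss_pwr = 10 ** (rss_dbm / 10)
    csi_pwr = np.sum(np.abs(csi) ** 2)
    return csi * np.sqrt(rss_pwr / csi_pwr)


rng = np.random.default_rng(3)
frame = rng.normal(size=64) + 1j * rng.normal(size=64)

scaled = rescale_to_rss(frame, rss_dbm=-55)
scaled_power_db = 10 * np.log10(np.sum(np.abs(scaled) ** 2))
print(f"Total power after rescaling: {scaled_power_db:.2f} dB")
```

The scale factor is real and positive, so only amplitudes change and the phase of each subcarrier is left untouched.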

@Gi-z
Owner

Gi-z commented May 19, 2021

There is good reason to believe that the AGC process is being reversed with the CRISLoc method, however without a measure of CSI before AGC has been applied there isn't definitive proof. I've got a couple of experiments planned to test this out, however it'll be a little while before I can get the proper equipment. Hoping to have something on this soon. I'll implement CRISLoc scaling as an optional feature in CSIKit shortly.

Going to close this issue now, but feel free to re-open and/or let me know if there's anything else I can clear up.
