Different Tcal[Jy] with DBBC3 FS 9.12.12 for BBCs with same sky RF, but using USB/LSB #97

Closed
varenius opened this issue Mar 2, 2021 · 10 comments

Comments

@varenius
Contributor

varenius commented Mar 2, 2021

Before every DBBC3 experiment, I run "ONOFF" on Cas A to check the TCAL strength. For various reasons, the DPFU/gain used for the conversion to Kelvin is not a priority, so instead I check and compare the Tcal[Jy] columns. Significant changes here indicate that the noise diode is not stable. (The Tcal[Jy] can be used to scale the TPI data, logged externally for now as per #32, given that we have a stable diode.)

In standard VGOS observations, 64 BBC channels in the DBBC3 are used with 32 MHz bandwidth each. For recording, the Oe/Ow antennas use the upper sideband (USB) for Band A (3-3.5 GHz) and the lower sideband (LSB) for Bands B, C, and D. Only one 32 MHz sideband gets recorded in the VDIF; the other BBC sideband is filtered out by the VSI bitmask when the VDIF streams are created. However, for the FS ONOFF procedure, each BBC has both sidebands available for TPI measurements. This means that we effectively have 128 BBC channels for frequency measurements.

Checking these things in detail, I noted something curious. Because the USB of some BBCs happens to be the LSB of other BBCs, some lines in the ONOFF VAL output give multiple values for the same frequency. In theory, these should be exactly the same. However, I found that they can differ. See for example the Ow b21060 logfile (full logfile at ftp://ivs.bkg.bund.de/pub/vlbi/ivsdata/aux/2021/b21060), where we have the following doublets in Band D (I picked this band to mitigate any RFI issues in my comparison):

X-pol LSB:
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 050l 7 l  10664.40 1.0015 57.93 1506.0 111.474  0.91
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 055l 7 l  10280.40 0.9980 46.00 1454.3 147.802  1.11
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 056l 7 l  10248.40 0.9978 45.94 1481.6 150.715  1.13
X-pol USB:
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 049u 7 l  10664.40 0.9974 54.40 1544.9 121.779  0.99
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 054u 7 l  10280.40 0.9975 45.37 1450.5 149.456  1.12
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 055u 7 l  10248.40 0.9963 45.11 1443.3 149.516  1.12
Y-pol LSB:
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 058l 8 r  10664.40 0.9960 53.36 1320.9 109.719  0.87
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 063l 8 r  10280.40 0.9968 42.53 1332.4 167.994  1.10
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 064l 8 r  10248.40 0.9904 43.70 1356.6 164.473  1.09
Y-pol USB:
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 057u 8 r  10664.40 0.9918 53.67 1431.2 118.188  0.93
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 062u 8 r  10280.40 0.9943 44.12 1334.1 162.123  1.06
/usr2/log/b21060ow.log:2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 063u 8 r  10248.40 0.9993 42.83 1339.0 165.652  1.09

(There are other overlapping USB/LSB pairs in other bands.)

Interestingly, for X-pol LSB 10664.40 we get a Tcal of 111.474 Jy, but for the USB at the same frequency we get 121.779 Jy, a difference of about 9%. This is surprising to me; I thought these would be identical. Is this a bug/feature/limitation of the DBBC3/FS (still running 9.12.12 here)?
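For anyone wanting to repeat this comparison on a full log, here is a minimal Python sketch that pairs overlaid USB/LSB detectors by polarization and frequency and reports the relative Tcal[Jy] difference. The field layout is an assumption read off the lines quoted above (detector, IF, pol, frequency after the pointing columns, with Tcal[Jy] second to last), and the helper names `parse_val` and `overlaid_pairs` are made up for this illustration; this is not FS code.

```python
from collections import defaultdict

# Four of the VAL lines quoted above (10664.40 MHz, both polarizations).
VAL_LINES = """\
2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 050l 7 l  10664.40 1.0015 57.93 1506.0 111.474  0.91
2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 049u 7 l  10664.40 0.9974 54.40 1544.9 121.779  0.99
2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 058l 8 r  10664.40 0.9960 53.36 1320.9 109.719  0.87
2021.060.16:07:17.55#onoff#VAL casa       299.5 58.4 057u 8 r  10664.40 0.9918 53.67 1431.2 118.188  0.93
"""


def parse_val(line):
    """Pull detector, pol, frequency and Tcal[Jy] out of one VAL line.

    Assumed layout after '#onoff#VAL': source, az, el, detector, IF, pol,
    frequency, ..., Tcal[Jy], and one trailing column.
    """
    fields = line.split("#onoff#VAL", 1)[1].split()
    return {
        "detector": fields[3],
        "pol": fields[5],
        "freq": float(fields[6]),
        "tcal_jy": float(fields[-2]),
    }


def overlaid_pairs(lines):
    """Return (key, usb_detector, lsb_detector, percent difference) tuples."""
    groups = defaultdict(dict)
    for line in lines:
        if "#onoff#VAL" not in line:
            continue
        rec = parse_val(line)
        sideband = rec["detector"][-1]  # 'u' or 'l'
        groups[(rec["pol"], rec["freq"])][sideband] = rec
    pairs = []
    for key in sorted(groups):
        by_sb = groups[key]
        if "u" in by_sb and "l" in by_sb:
            usb, lsb = by_sb["u"], by_sb["l"]
            # Difference expressed relative to the LSB value.
            pct = 100.0 * (usb["tcal_jy"] - lsb["tcal_jy"]) / lsb["tcal_jy"]
            pairs.append((key, usb["detector"], lsb["detector"], pct))
    return pairs


PAIRS = overlaid_pairs(VAL_LINES.splitlines())
for (pol, freq), usb, lsb, pct in PAIRS:
    print(f"pol {pol} {freq:.2f} MHz: {usb} vs {lsb}: {pct:+.1f}% in Tcal[Jy]")
```

Feeding it the output of a `grep '#onoff#VAL'` on a whole log should list every overlaid pair, since the split on `#onoff#VAL` ignores any filename prefix.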

Theories: Even though the frequencies are the same, the BBCs are different. This means different signal paths (although on the same core3h board) and different BBC gain levels (since the AGC would adjust for the total of a BBC's USB and LSB, and that total would differ between the two BBCs). Still, I would assume that the power we get would be the same, as it should be a physical power. Perhaps, given that the BBCs are read in a different order, the readout time, and possibly the set/reset time (for 0-level measurements, AGCs, etc.), is slightly different. Maybe all these factors could explain the behaviour. Nevertheless, I find it curious.

One interesting test could be to tune all 8 BBCs of the core3h board (running v124 firmware here) to the same BBC freq, and see if they are identical or not. Then, shift half of them so the USB becomes LSB of the other 4, and check again. Then, do the same but shift down so LSB becomes USB. Maybe there is a pattern somewhere?

We should test this with FS10 at some point too.

@wehimwich
Member

wehimwich commented Mar 2, 2021

It is an interesting situation.

Picking on 10664.4 (which has the largest differences in what you sent), I went through and verified the Tsys, SEFD, and Tcalj calculations by hand based on the information in the log. This is given in (gasp :) Issue97.xlsx. The Tcal and flux values were the same for 49u and 50l. The only thing left on the FS side to check is that it is handling the BBC responses correctly. If you re-run onoff with echo=on, we could check that.
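As a quick cross-check that does not need the spreadsheet, the sketch below derives the Tcal in Kelvin implied by each VAL line. It assumes the textbook on-off relations (Tsys = Tcal_K * off / cal, SEFD = S * off / src, Tcal_Jy = S * cal / src, which give Tcal_K = Tsys * Tcal_Jy / SEFD) and assumes that the three columns before the last one in each VAL line are Tsys, SEFD, and Tcal[Jy]. Neither assumption is taken from the FS source, and `implied_tcal_k` is a name invented here.

```python
# Values copied from the 050l and 049u VAL lines in the first post.
# Column meaning (Tsys, SEFD, Tcal[Jy]) is an assumption, see the text above.
VAL = {
    "050l": {"tsys": 57.93, "sefd": 1506.0, "tcal_jy": 111.474},
    "049u": {"tsys": 54.40, "sefd": 1544.9, "tcal_jy": 121.779},
}


def implied_tcal_k(entry):
    """Tcal in Kelvin implied by Tsys, SEFD and Tcal[Jy] (textbook relation)."""
    return entry["tsys"] * entry["tcal_jy"] / entry["sefd"]


TCAL_K = {det: implied_tcal_k(entry) for det, entry in VAL.items()}
for det, tcal_k in TCAL_K.items():
    print(f"{det}: implied Tcal = {tcal_k:.3f} K")
```

Under those assumptions both detectors imply nearly the same Tcal in Kelvin, which is what one would expect if the same cal table value was used for both and the difference comes from the measured counts.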

From the spreadsheet, we can see that the off source values for both detectors are noisier than the on source values. Perhaps this is due to RFI, but the spread is still small. The differences between the two detectors for a given type of value are somewhat bigger, and probably the cause of the differences in the calculated values. Since the values don't differ by a constant, I am inclined to not think they are due to the actual zero levels of the detectors (which the FS does not measure) being different.

I agree that there could be differences due to being different devices and different timing. The FS doesn't actually measure the zero levels (there is no way I know of for the DBBC3). Since these are digital detectors the zero levels should be small, if not actually zero. Maybe upper/lower sideband bandpass shape could be an issue, if in fact they are different. The FS sets the gains to manual for the duration of the measurements. I think that must be working or the measurements would be a big mess.

Something we can do with the (in development) DBBC3 branch of the FS is to compare the BBC responses with the multicast values to see if they agree. Since the values from a single multicast message should all be from the same second, we could also verify whether they are producing the same values for overlaid sidebands in the same second. There could be both USB/LSB and USB/USB comparisons. I think this raw TP value check should be done before going onto other comparisons.

You might be able to do a partial raw TP check by hand with 9.12.12 for a couple BBCs, using a SNAP procedure that samples the two BBCs with overlaid sidebands in quick succession. You might have to try a few times to get samples from the same second.

It may make sense in the future to switch to using the multicast for the ONOFF measurements. It could potentially make them faster. However, it would probably not make the agreement between overlaid sidebands better unless the difference is driven by RFI, in which case, you could still have time variations.

@nvi-inc nvi-inc deleted a comment from haavee Mar 4, 2021
@varenius
Contributor Author

I've had no time to dig into this yet, and likely not for a few weeks. But I just want to note that this issue happened with v124 of the DBBC3 firmware. I have not tested levels or TPI logging from v125 yet.

@wehimwich
Member

wehimwich commented Mar 11, 2021

This post is a relevant digression into the read back of TP data. A later post will compare TP results from overlaid upper and lower sidebands. It might be a while until the second one because we ran into some things we don't understand about v125. All of this was done with v125.

First, it seems that it can take some time, around 40 seconds in the case I looked at, for the gain to stabilize after a BBC frequency change. I only looked at a relatively "small" change of 32 MHz. The time involved probably isn't too surprising. Presumably there can be cases where it takes more or less time. Once it stabilizes, two BBCs with the same frequencies get the same counts, generally:

2021.069.13:31:37.51/bbc041/ 859.600000,f,32, 1,agc,223,238,16233,16308,16154,16233
2021.069.13:31:37.53/bbc042/ 859.600000,f,32, 1,agc,223,238,16233,16308,16154,16233

However, there are still occasional differences at the single count level:

2021.069.13:31:52.54/bbc041/ 859.600000,f,32, 1,agc,226,241,16863,16901,16790,16827
2021.069.13:31:52.56/bbc042/ 859.600000,f,32, 1,agc,226,241,16864,16902,16791,16827

There can also be larger differences:

2021.069.13:31:55.36/bbc041/ 859.600000,f,32, 1,agc,224,239,16589,16637,16508,16563
2021.069.13:31:55.38/bbc042/ 859.600000,f,32, 1,agc,223,238,16428,16494,16345,16415

Note that the gains are also different in this case. The On-Off differences are nearly the same, differing by 2 for USB, and 5 for LSB. The latter is about 6%. However, this only seems to happen if the measurements are near what might be called the "update epoch" (see below). The antenna was looking at the ground during these tests. That may have exacerbated the difference. (FYI, the order of the TP values is USBon, LSBon, USBoff, LSBoff.)
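The arithmetic behind those On-Off numbers can be reproduced with a short Python sketch. It assumes the last four comma-separated fields of a logged BBCnnn response are USBon, LSBon, USBoff, LSBoff, as stated above; the helper names `parse_bbc` and `cal_signal` are invented for this illustration.

```python
# The two "larger differences" lines quoted above.
LINE_041 = "2021.069.13:31:55.36/bbc041/ 859.600000,f,32, 1,agc,224,239,16589,16637,16508,16563"
LINE_042 = "2021.069.13:31:55.38/bbc042/ 859.600000,f,32, 1,agc,223,238,16428,16494,16345,16415"


def parse_bbc(line):
    """Split a logged BBCnnn response into named integer fields."""
    _, name, payload = line.split("/", 2)
    f = [item.strip() for item in payload.split(",")]
    return {
        "bbc": name,
        "gain_u": int(f[5]),
        "gain_l": int(f[6]),
        "on_u": int(f[7]),   # USB, cal on
        "on_l": int(f[8]),   # LSB, cal on
        "off_u": int(f[9]),  # USB, cal off
        "off_l": int(f[10]), # LSB, cal off
    }


def cal_signal(rec):
    """Cal-on minus cal-off counts for each sideband."""
    return {"u": rec["on_u"] - rec["off_u"], "l": rec["on_l"] - rec["off_l"]}


CAL_041 = cal_signal(parse_bbc(LINE_041))
CAL_042 = cal_signal(parse_bbc(LINE_042))
DIFF = {sb: CAL_042[sb] - CAL_041[sb] for sb in ("u", "l")}
PCT_LSB = 100.0 * DIFF["l"] / CAL_042["l"]

print("bbc041 on-off:", CAL_041)
print("bbc042 on-off:", CAL_042)
print("difference:", DIFF, f"(LSB: {PCT_LSB:.1f}%)")
```

The same parser can be pointed at any pair of BBCnnn lines from an echo=on log to see how close two detectors were at a given instant.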

Second, it looks like the TP values from the BBCnnn commands are the same as from the multicast:

2021.069.13:22:05.56#dbtcn#tpcont/ 041l,15418,15341, 041u,15534,15453, 042l,15367,15298, 042u,15474,15395, if, 56072521, 55926651
2021.069.13:22:05.66/bbc041/ 827.600000,f,32, 1,agc,228,231,15534,15418,15453,15341
2021.069.13:22:05.68/bbc042/ 859.600000,f,32, 1,agc,232,246,15474,15367,15395,15298

(The tpcont/ count orders are on, then off.)
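A small Python sketch of that comparison, using only the three log lines quoted above. It assumes the tpcont payload is a sequence of `detector,on,off` triples and that the BBCnnn response ends with USBon, LSBon, USBoff, LSBoff; `parse_tpcont` and `parse_bbc_tp` are hypothetical helper names, not FS functions.

```python
import re

TPCONT = ("2021.069.13:22:05.56#dbtcn#tpcont/ 041l,15418,15341, 041u,15534,15453, "
          "042l,15367,15298, 042u,15474,15395, if, 56072521, 55926651")
BBC_LINES = [
    "2021.069.13:22:05.66/bbc041/ 827.600000,f,32, 1,agc,228,231,15534,15418,15453,15341",
    "2021.069.13:22:05.68/bbc042/ 859.600000,f,32, 1,agc,232,246,15474,15367,15395,15298",
]


def parse_tpcont(line):
    """Map detector name (e.g. '041u') to (cal-on, cal-off) multicast counts."""
    payload = line.split("tpcont/", 1)[1]
    triples = re.findall(r"(\d{3}[lu]),\s*(\d+),\s*(\d+)", payload)
    return {det: (int(on), int(off)) for det, on, off in triples}


def parse_bbc_tp(line):
    """Map detector name to (cal-on, cal-off) counts from a BBCnnn response."""
    _, name, payload = line.split("/", 2)
    f = [item.strip() for item in payload.split(",")]
    on_u, on_l, off_u, off_l = (int(x) for x in f[7:11])
    number = name[3:]  # 'bbc041' -> '041'
    return {number + "u": (on_u, off_u), number + "l": (on_l, off_l)}


MULTICAST = parse_tpcont(TPCONT)
COMMAND = {}
for bbc_line in BBC_LINES:
    COMMAND.update(parse_bbc_tp(bbc_line))

AGREE = {det: MULTICAST.get(det) == counts for det, counts in COMMAND.items()}
print(AGREE)
```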

I am a little confused about the timing of this. The multicast arrives at about 570 ms into the second. After sampling repetitively, it looks like the BBC commands update their outputs at about 370 ms into the second, the "update epoch". For example:

2021.069.14:00:32.37/bbc041/ 859.600000,f,32, 1,agc,223,237,16281,16222,16203,16144
2021.069.14:00:32.39/bbc041/ 859.600000,f,32, 1,agc,223,237,16274,16224,16199,16147

Note that my example above for comparing to the multicast is well away from this transition point. It seems like there could be at least two scenarios: (1) everything changes at about 370 ms and that is when the multicast data is generated, or (2) internally it changes at the 1 PPS, but it takes 370 ms to get that into the BBCnnn response and multicast. Maybe the latter is more likely, but I don't think that matters for most purposes. However, if you are trying to scan across a source continuously (not in steps like fivpt), the position is changing relative to the source while you are sampling. I think you could measure it by scanning first from one side and then from the other; resolution might be low in terms of changing power levels though. It could be that the "update epoch" changes, but if it and the multicast are driven by the 1 PPS, it may be stable.

@varenius it would still be useful to redo your original measurement with echo=on. That would allow checking that the correct counts are picked up by onoff. It would also allow us to see what the timing is for the different BBCs. I think that is more likely to be the cause of the difference you saw. However, in what I have looked at for USB/LSB, it looks like there may be some differences there as well. I hope to get back to that soon.

@wehimwich
Member

Continuing the digression on TP read back, it seems that the "update epoch" depends on the BBC number. For example:

2021.080.23:30:57.13/bbc011/3384.400000,b,32, 1,agc,196,243,16078,15928,15855,15757
2021.080.23:30:57.16/bbc011/3384.400000,b,32, 1,agc,196,243,16065,15931,15852,15750
2021.080.23:31:58.56/bbc060/1079.600000,h,32, 1,agc,255,255, 6088, 5436, 6069, 5422
2021.080.23:31:58.58/bbc060/1079.600000,h,32, 1,agc,255,255, 6085, 5435, 6066, 5418
2021.080.23:30:07.27/bbc090/2011.600000,d,32, 1,agc,255,255, 2071, 2595, 2057, 2578
2021.080.23:30:07.30/bbc090/2011.600000,d,32, 1,agc,255,255, 2070, 2596, 2055, 2579
2021.080.18:10:19.55/bbc128/1367.600000,h,128, 1,agc,255,255,12036,10014,11974, 9961
2021.080.18:10:19.58/bbc128/1367.600000,h,128, 1,agc,255,255,12043,10015,11979, 9964

The transition times vary at the few 10s of ms level. The summary is:

| BBC | Update epoch (ms) |
|-----|-------------------|
| 011 | 160 |
| 041 | 370 |
| 060 | 570 |
| 090 | 280 |
| 128 | 570 |

This might be explained by there being two threads for updating the BBC command TPs: one for BBCs 1-64; the other, BBCs 65-128. With both starting at about the PPS epoch and requiring about 10 ms per BBC. Wild guess.
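To see how far that guess gets, here is a small Python sketch of the two-thread model: each thread is assumed to start at the PPS and spend 10 ms per BBC, so the predicted epoch is the BBC's position within its group of 64 times 10 ms. The model and the constant are only the wild guess stated above, not anything known about the DBBC3 firmware; the observed values are those in the summary.

```python
# Observed update epochs (ms into the second) from the summary above.
OBSERVED_MS = {11: 160, 41: 370, 60: 570, 90: 280, 128: 570}

MS_PER_BBC = 10.0   # guessed time per BBC
GROUP_SIZE = 64     # guessed split: BBCs 1-64 and 65-128


def predicted_epoch_ms(bbc, ms_per_bbc=MS_PER_BBC):
    """Predicted update epoch if each thread walks its BBCs in order."""
    position = bbc if bbc <= GROUP_SIZE else bbc - GROUP_SIZE
    return position * ms_per_bbc


RESIDUALS = {bbc: obs - predicted_epoch_ms(bbc) for bbc, obs in OBSERVED_MS.items()}
for bbc, obs in OBSERVED_MS.items():
    print(f"bbc{bbc:03d}: observed {obs} ms, predicted {predicted_epoch_ms(bbc):.0f} ms, "
          f"residual {RESIDUALS[bbc]:+.0f} ms")
```

With these five points the prediction lands within 70 ms of each observed value, which neither confirms nor rules out the guess; sampling more BBCs would be needed to tell.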

@wehimwich
Member

wehimwich commented Apr 18, 2021

Since the equivalence of the BBC command and the multicast TP output, and additionally the Tsys calculation for the multicast, has apparently been established by #90, I thought I would try to look at this from the point-of-view of the multicast data. This test was done with DDC_V v124, which is hopefully no different for these purposes than DDC_U v125 used in that issue.

Starting with the setup loaded by Eskil (VGOS, I think), I did a simple baseline test of two BBCs (23 and 24) with the same frequency (2360.6). It required about 25 seconds for the TP values to stabilize to agreement after the frequency was set, but then all four TP values agreed exactly between the two BBCs for the 10 or so further seconds I collected data. As would be expected, Tsys was also identical for those samples. So far so good.

Then I looked at the USB/LSB issue more using Eskil's choice of 049u and 050l (without changing frequencies, BBC049 at 919.6 and BBC050 at 951.6). The raw data is in 4950.txt. After deleting the two samples with overflows, there were 46 sets of multicast samples to compare. Neither the counts nor the Tsys values were equal for the overlaid USB and LSB channels. Summarizing the Tsys results from looking at the ground:

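| channel | 049l | 049u | 050l | 050u | 049u-050l | 049l-050u | 049l-050l | 049u-050u | 049l-049u | 050l-050u |
|---------|------|------|------|------|-----------|-----------|-----------|-----------|-----------|-----------|
| Average | 412.6 | 457.4 | 408.2 | 451.6 | 49.3 | -39.0 | 4.5 | 5.8 | -44.8 | -43.5 |
| RMS | 43.0 | 50.8 | 44.1 | 59.1 | 71.1 | 31.6 | 45.3 | 75.5 | 61.7 | 47.8 |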

The scatter of the individual channels is about 10%. The differences between the two channels that should nominally agree, 049u and 050l, have an even larger scatter. The differences between the two channels on the "wings", 049l and 050u, have a scatter that is smaller by about a factor of two. If significant, this seems odd. I saw similar behavior for two other BBCs I tested in this way. It seems unlikely that the channels could be mislabeled, but that should be easy to check by injecting a test tone.

I included the additional cross channel differences in the table out of curiosity. For this number of samples, the sigma of the mean is about a factor of seven smaller than the RMS. Some of the average differences might be significantly different from zero. It seems odd that LSB-LSB and USB-USB differences seem closer to zero than the others, but maybe it is chance. OTOH, maybe there is something systematically different between USB and LSB channels.
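The "factor of seven" is sqrt(46), about 6.8. A minimal Python sketch of the resulting significance estimate follows. It assumes the RMS row is the scatter of the per-sample differences about their mean and that the 46 samples are independent; both are assumptions, and if the RMS was taken about zero the numbers would shift somewhat.

```python
import math

N_SAMPLES = 46  # multicast sample sets kept after removing overflows

# (average, RMS) of the Tsys differences, copied from the table above.
DIFFERENCES = {
    "049u-050l": (49.3, 71.1),
    "049l-050u": (-39.0, 31.6),
    "049l-050l": (4.5, 45.3),
    "049u-050u": (5.8, 75.5),
    "049l-049u": (-44.8, 61.7),
    "050l-050u": (-43.5, 47.8),
}


def significance(average, rms, n=N_SAMPLES):
    """Average divided by the sigma of the mean, rms / sqrt(n)."""
    return average / (rms / math.sqrt(n))


SIGMAS = {name: significance(avg, rms) for name, (avg, rms) in DIFFERENCES.items()}
for name, sigmas in SIGMAS.items():
    print(f"{name}: {sigmas:+.1f} sigma")
```

Under those assumptions the same-sideband differences (049l-050l, 049u-050u) come out below one sigma, while the other four are several sigma from zero.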

It could be that I still don't have counts properly aligned in some way or that the assumption that the two firmware versions behave in the same way is wrong. This may be the first time we have this situation with digital down conversion, settable channel frequencies, and simultaneous samples for a USB/LSB comparison.

It would be interesting to look at the results on the sky. A run of onoff with echo=on would be helpful to verify the decoding of the data by onoff. If possible, it would be helpful to get data not only for an overlaid USB and LSB, but also the other two sidebands for the BBCs involved (the "wings"). It would also be helpful to get both sidebands from each of two BBCs set to the same frequencies, i.e. with the same USBs and LSBs. In principle, one onoff run could provide all of that.

@varenius
Contributor Author

I'm preparing a sky test with FS10 to continue this ticket. I know how to run onoff with echo=on. I know how to set BBCs. I can arrange overlaid USB and LSB (the USB of one BBC overlaps with the LSB of another BBC), and I can arrange both sidebands set to the same frequencies (so both USB and LSB overlap). But I'm not sure exactly what @wehimwich is after with "also the other two sidebands for the BBCs involved (the "wings")". Could you please clarify? Then I can get everything you want in one go.

@wehimwich
Member

wehimwich commented Sep 10, 2021

I'm glad you will be able to look at this @varenius. What I meant was that if you have overlaid different sidebands for two different BBCs, say 049u and 050l (the "body"), then 049l and 050u would be the "wings." It was odd in the above data that the same sideband in two different BBCs agreed better than both the two overlaid channels and the two sidebands from the same BBC. Maybe that is the difference in the filter responses. I do (EDIT was: don't) think the sideband count labels being flipped could explain it.
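To make the "body" and "wings" concrete, here is a Python sketch that works out the IF range covered by each sideband of BBC049 and BBC050. It uses the BBC frequencies quoted earlier in the thread (919.6 and 951.6) and the 32 MHz bandwidth, and assumes the USB spans from the BBC frequency upward and the LSB from it downward; the function names are invented for the illustration.

```python
BW_MHZ = 32.0
BBC_FREQ_MHZ = {"049": 919.6, "050": 951.6}  # values quoted earlier in the thread


def sideband_range(detector):
    """IF range (low, high) in MHz covered by a detector such as '049u'."""
    bbc, sideband = detector[:3], detector[3]
    freq = BBC_FREQ_MHZ[bbc]
    if sideband == "u":
        return (freq, freq + BW_MHZ)
    return (freq - BW_MHZ, freq)


def overlap_mhz(det_a, det_b):
    """Width in MHz of the IF range shared by two detectors (0 if none)."""
    lo_a, hi_a = sideband_range(det_a)
    lo_b, hi_b = sideband_range(det_b)
    return max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))


for det in ("049l", "049u", "050l", "050u"):
    low, high = sideband_range(det)
    print(f"{det}: {low:.1f} to {high:.1f} MHz")

BODY_OVERLAP = overlap_mhz("049u", "050l")   # the overlaid pair
WING_OVERLAP = overlap_mhz("049l", "050u")   # the wings
print(f"049u/050l overlap: {BODY_OVERLAP:.1f} MHz, 049l/050u overlap: {WING_OVERLAP:.1f} MHz")
```

The body pair covers the same 32 MHz of IF, while the wings sit on either side of it with no overlap, so there is no obvious reason for the wings to agree better than the body.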

@varenius
Contributor Author

Here is a log with the VGOS setup but using FS10, and echo=on. I also set two BBCs to the same frequency, so full overlap. Hopefully this will provide the requested data. If you need more, please let me know.
oo01.log

@wehimwich
Member

wehimwich commented Oct 4, 2021

Thank you @varenius. I looked through these data fairly thoroughly. My overall conclusion is that if there is a problem it is not in the FS. I think we can close this issue for the FS.

In oo01.log, there are two runs of onoff. The general scheme for the data is that 49u and 50l are overlaid sidebands (therefore nominally the same RF bandpass, but reversed images) and BBCs 51 and 52 were at the same BBC settings as each other (so they have matched upper and lower sidebands). There were 60 more BBCs (120 more sidebands) in each run, but I did not look at those. The basic question we are interested in is how different the overlaid sidebands are.

I used a spreadsheet, oo01.xlsx, to compare the results. There are two sheets: one is VAL, for the comparison of the VAL results; the other, RAW, is for verifying the results from the raw data to the VAL output. The VAL comparison is summarized below first.

I calculated the differences in the VAL (Tsys, SEFD, Tcal(j)) entries for 49u-50l (overlaid SBs), 49l-50u (the wings), 51l-52l, and 51u-52u. Also to compare sidebands within a BBC, I calculated the differences for 49l-49u, 50l-50u, 51l-51u, and 52l-52u. This is all in the VAL sheet of the spreadsheet and summarized below:

Percentage differences for the first onoff run:

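| | 49u-50l | 49l-50u | 51l-52l | 51u-52u | 49l-49u | 50l-50u | 51l-51u | 52l-52u |
|---------|------|------|------|------|------|------|-------|------|
| Tcal(j) | 8.9 | 0.3 | 0.0 | -0.1 | -4.6 | -4.0 | 6.5 | 6.5 |
| SEFD | 3.2 | -0.7 | 0.0 | 0.0 | -3.2 | 0.4 | -1.3 | -1.3 |
| Tsys | -6.8 | 2.9 | 0.0 | 0.0 | 1.4 | 8.3 | -10.0 | -9.9 |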

The second onoff run gave very similar results, which are shown here for completeness:

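| | 49u-50l | 49l-50u | 51l-52l | 51u-52u | 49l-49u | 50l-50u | 51l-51u | 52l-52u |
|---------|------|------|------|------|------|------|-------|-------|
| Tcal(j) | 8.8 | 0.5 | 0.0 | 0.0 | -4.0 | -4.3 | 7.1 | 7.1 |
| SEFD | 1.9 | -0.8 | 0.0 | 0.0 | -3.1 | 0.4 | -1.3 | -1.3 |
| Tsys | -6.9 | 2.6 | 0.0 | 0.0 | 0.8 | 8.6 | -10.6 | -10.6 |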

Quick comments:

  • Results between the two runs agree to within 0.6%, but usually better and often 0.1%.
  • BBCs 51 and 52 (third and fourth columns of numbers) agree with each other very well, as we would hope.
  • The differences between LSB and USB within a BBC (the last four columns) are in the 0.4-10.6% range. The differences for BBC 51 and 52 (the last two columns) are consistent, as we would expect.
  • The differences between the overlaid sidebands (49u-50l) are comparable to the differences between different sidebands within a BBC.
  • The differences between the wing sidebands (49l-50u) are smaller than for the overlaid sidebands (and for different sidebands within a BBC).

I think it is somewhat remarkable that the wing channels agree better than the overlaid sidebands. We had seen this before when the telescope was pointed at the ground. So it seems to be consistent behavior. It could be that this is just the way data are (for whatever reason), but if the sideband count data labels were reversed within a BBC or the FS reversed the data, the results for the wings would actually be for the overlaid sidebands. To check on this for the FS, I reviewed the onoff calculations from the raw data through the VAL results. The DBBC3 document shows the DBBCnn output as:

DBBCnn/freq,IF,bwd,tpint,gainctrl,gainU,gainL,tpU/calon,tpL/calon,tpUcaloff,tpLcaloff

So the calon data are first, then caloff; for a given cal state, the upper sideband comes first, then the lower.

I recalculated the VAL results for the first onoff run for both sidebands of both BBC 49 and 50 starting from the raw counts (using the echo=on output). This is in the RAW sheet of the spreadsheet. This all looks correct. The recalculated values for Tsys, SEFD, and Tcal(j) (12 values total) agree with the onoff calculated values to 0.01% or better. That is, there are nearly five or more significant digits of agreement. This is at least the second time now that I have verified this for the DBBC3. I also verified the ONSO, ONSC, OFFC, and OFFS onoff output; so I think that provides some more general verification for onoff, if it has collected the right raw data.

NOTE: The flux values used in the calculations for 49l, 49u/50l, and 50u, are, respectively, 333.65, 334.55, and 335.45 (LSB first LO and BBC49 is set to a lower frequency than BBC50). The total range is about 0.6%. That variation does not seem to be a dominating contribution. If the 334.55 is used for the wings in the first onoff run, the difference of 49l-50u for SEFD improves to 0.1%; Tcal(j) gets worse, 0.9%. This is calculated in the RAW sheet.

In summary, I am inclined to think that the differences in the overlaid sidebands are just the way the DBBC3 behaves, maybe due to differences in bandpass shapes or other considerations. Sven might know. I don't think it is likely that it is caused by mislabeling of the count data by the DBBC3. If someone is interested and has a DBBC3 at their disposal, they might be able to verify that by injecting a signal that will raise the counts of just one sideband of a BBC.

@wehimwich
Member

One thing I did not mention in the post above is that for the first onoff run the count data had values of around 6000. However, for the second run the counts were about 16000, which is the expected nominal value. The values being lower than nominal in the first run might be explained by the run being started soon after the telescope was moved away from the ground and the gain not having fully settled. The gain is set to man during onoff and then released back to agc at the end. The nominal values observed in the second run might be explained by the two minute gap between the first and second onoff runs. That may have given the DBBC3 enough time to fully adjust the gains before the second run. The results of the VAL analysis are essentially the same for the first and second runs, so I don't think the gain levels affected the results. The RAW analysis, which only looked at the first run, is just a numerical check of the onoff calculation based on the data returned by the DBBC3, so whether the DBBC3 was at the correct gain level is irrelevant.

wehimwich added a commit that referenced this issue Apr 10, 2023
 1 FS_DBBC3_MULTICAST_BBC_TPI_USB_LSB_SWAP
 0 FS_DBBC3_MULTICAST_BBC_ON_OFF_SWAP
 0 FS_DBBC3_MULTICAST_CORE3H_POLARITY0_ON_OFF_SWAP
 1 FS_DBBC3_MULTICAST_CORE3H_POLARITY2_ON_OFF_SWAP
 0 FS_DBBC3_MULTICAST_CORE3H_TIME_ADD_SECONDS
   FS_DBBC3_MULTICAST_CORE3H_TIME_INCLUDED
     0 for DDC_V
     1 for all others
 1 FS_DBBC3_MULTICAST_VERSION_ERROR_MINUTES
 1 FS_DBBC3_BBCNNN_TPI_USB_LSB_SWAP
 1 FS_DBBC3_BBCNNN_GAIN_USB_LSB_SWAP
 0 FS_DBBC3_BBCNNN_ON_OFF_SWAP
 0 FS_DBBC3_IFTPX_POLARITY0_ON_OFF_SWAP
 1 FS_DBBC3_IFTPX_POLARITY2_ON_OFF_SWAP
 1 FS_DBBC3_BBC_GAIN_USB_LSB_SWAP

Closes #97
Closes #192

Specifically the USB_LSB_SWAPS close those issues.