Regression: handling of streams with same name and host #31

jfrey-xx · 2020-07-20T09:39:24Z

Hello,

I am running an experiment involving multiple users, and I use the LSL stream name to differentiate users in the system and in the recordings. Each user has a dedicated device with several physiological signals running on it. Hence I end up with LSL streams that have the same name and hostname but different types and IDs. Let's say I have 3 streams for one user.

In prior versions of LabRecorder, e.g. 1.12 from the (old) FTP, I could see the correct number of streams (3) in the window, even though, because only name and hostname are listed, it is difficult (if not impossible) to tell which physiological signal each line corresponds to. Still, I could make sure that I was recording everything.

Starting with the 1.13 releases on GitHub, I can only see one line per name/hostname in the GUI. Ticking it usually records all related streams (and I could see in the debug info of the terminal that 3 streams are being recorded), but it happened that some streams were missing from the resulting xdf file -- maybe due to this odd behavior of the GUI, maybe due to an error on my side.

I managed to find the commit that changed the behavior of the GUI: 588243a

Because reverting to the old behavior (listing all streams but with duplicated info in the GUI) is not ideal, would it be possible to display, instead of just name + hostname, all the info, e.g. name + type + hostname (and maybe ID as well, just to make sure)? Doing so will probably help to fix the bug :) Extra details such as sampling rate could also be useful, but the UI might start to get crowded.


tstenner · 2020-07-20T13:53:15Z

That's one part of a grant proposal I'm working on, only a bit more sophisticated. Basically, it allows arbitrary queries for streams with name + hostname as the default. Even before the change, LabRecorder used Name@Host internally, so it was assumed that streams on the same host have different names.
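To illustrate why a Name@Host key cannot tell such streams apart, here is a minimal sketch (not LabRecorder's actual code). The stream descriptors, the `find_key_collisions` helper, and the example names and hostnames are all hypothetical; only the Name@Host key format comes from the comment above.

```python
from collections import defaultdict


def find_key_collisions(streams):
    """Group stream descriptors by 'Name@Host' and return the keys
    that are shared by more than one stream."""
    groups = defaultdict(list)
    for stream in streams:
        key = f"{stream['name']}@{stream['hostname']}"
        groups[key].append(stream)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical descriptors: one user, one device, three signals.
streams = [
    {"name": "P001", "hostname": "device-a", "type": "ECG", "source_id": "a1"},
    {"name": "P001", "hostname": "device-a", "type": "Breathing", "source_id": "a2"},
    {"name": "P001", "hostname": "device-a", "type": "HeartRate", "source_id": "a3"},
    {"name": "P002", "hostname": "device-b", "type": "ECG", "source_id": "b1"},
]

collisions = find_key_collisions(streams)
for key, members in collisions.items():
    # All three P001 streams collapse onto a single key.
    print(key, [m["type"] for m in members])
```

Adding the type or source ID to the key would separate these three streams, which is essentially what the original request asks the GUI to show.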

cboulay · 2020-07-20T14:22:04Z

And from someone who writes software for analyzing xdf files: please don't make it easier to have multiple streams that are difficult to disambiguate.

@jfrey-xx, the type should, as much as possible, come from a list of predefined types for which there is a corresponding metadata specification. (We are happy to add more types if none of these are suitable.)

There might be a scenario where someone has 2 different streams generating event markers, e.g., one from a game, one from an input device. Thus, the stream-name & stream-type combination would not be unique if we co-opted stream-name to represent subject-name: [{'name': 'Chad', 'type': 'markers'}, {'name': 'Chad', 'type': 'markers'}]. Instead, stream-name usually contains some information about the device or application it's coming from. So in the above example, we might have [{'name': 'UnityEvents', 'type': 'markers'}, {'name': 'GamepadButtons', 'type': 'markers'}].

I've never recorded from multiple subjects before, but I sympathize that it would be nice to be able to differentiate between them in LabRecorder's interface. In that case it should be enough to augment the stream name with a subject id: `GamepadButtons-P002`.
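A minimal sketch of that naming scheme, assuming a `-` separator and that every stream name carries the subject suffix. The helper names `with_subject` and `split_subject` are hypothetical and not part of LSL or LabRecorder.

```python
def with_subject(stream_name, subject_id, sep="-"):
    """Append a subject id to a device/application stream name."""
    return f"{stream_name}{sep}{subject_id}"


def split_subject(tagged_name, sep="-"):
    """Split on the LAST separator so that base names containing the
    separator (e.g. 'Liesl-Mock-EEG') survive intact."""
    base, found, subject = tagged_name.rpartition(sep)
    if not found:
        # No separator at all: treat the whole string as the base name.
        return tagged_name, None
    return base, subject


tagged = with_subject("GamepadButtons", "P002")
print(tagged)
print(split_subject(tagged))
print(split_subject("Liesl-Mock-EEG-P002"))
```

Note the limitation: a name without a subject suffix but with a separator in it would be split in the wrong place, so the convention only works if it is applied to every stream.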

By the way, there has been some brief discussion about a tool to anonymize xdf files. Some of the things that the tool would do is strip out any subject metadata and modify all the timestamps and recorded clock offsets to some fictional base t=0. I don't think the tool would touch the stream names, so if you are storing identifiable info in the stream name then you would have to go through each file manually and modify them if you wanted to anonymize them.

agricolab · 2020-07-21T12:26:20Z

Until this is fixed, one option is to use LabRecorderCLI. For example, I mocked two EEG streams with identical hostname, type, name, and most other fields:

python -c "import pylsl; [print(s.as_xml()) for s in pylsl.resolve_streams()]"

returns

<?xml version="1.0"?>
<info>
	<name>Liesl-Mock-EEG</name>
	<type>EEG</type>
	<channel_count>8</channel_count>
	<nominal_srate>1000</nominal_srate>
	<channel_format>float32</channel_format>
	<source_id>8744345469393</source_id>
	<version>1.1000000000000001</version>
	<created_at>4739480.0292954333</created_at>
	<uid>7a8a95c8-a3e6-47f2-bf94-5bfce350c89c</uid>
	<session_id>default</session_id>
	<hostname>rgugg-desktop</hostname>
	<v4address />
	<v4data_port>16572</v4data_port>
	<v4service_port>16572</v4service_port>
	<v6address />
	<v6data_port>16573</v6data_port>
	<v6service_port>16573</v6service_port>
	<desc />
</info>

<?xml version="1.0"?>
<info>
	<name>Liesl-Mock-EEG</name>
	<type>EEG</type>
	<channel_count>8</channel_count>
	<nominal_srate>1000</nominal_srate>
	<channel_format>float32</channel_format>
	<source_id>8754617174481</source_id>
	<version>1.1000000000000001</version>
	<created_at>4739492.5496216211</created_at>
	<uid>a9f7e588-061a-401b-b3de-c645111915a0</uid>
	<session_id>default</session_id>
	<hostname>rgugg-desktop</hostname>
	<v4address />
	<v4data_port>16574</v4data_port>
	<v4service_port>16574</v4service_port>
	<v6address />
	<v6data_port>16575</v6data_port>
	<v6service_port>16575</v6service_port>
	<desc />
</info>

Calling LabRecorderCLI from the terminal like this:

LabRecorderCLI test.xdf "type='EEG'"

gives me a nice readout as follows:

Found Liesl-Mock-EEG@rgugg-desktop matching 'type='EEG''
Found Liesl-Mock-EEG@rgugg-desktop matching 'type='EEG''
Starting the recording, press Enter to quit
Opened the stream Liesl-Mock-EEG.
Opened the stream Liesl-Mock-EEG.
Received header for stream Liesl-Mock-EEG.
Received header for stream Liesl-Mock-EEG.
Started data collection for stream Liesl-Mock-EEG.
Started data collection for stream Liesl-Mock-EEG.

By inspecting the stdout, one can check whether the expected number of streams is being recorded. This can even be automated and wrapped in a programming language of your choice (see e.g. the discussion on #21).
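As a rough sketch of such automation, the following counts the "Started data collection" lines in captured output. It assumes the message format shown in the readout above, which may differ between LabRecorderCLI versions. The function name and the `expected` value are hypothetical, and the sample text is copied from the readout rather than captured live.

```python
import re
from collections import Counter

# Assumed line format, taken from the readout above.
STARTED = re.compile(r"^Started data collection for stream (.+)\.$")


def count_started_streams(stdout_text):
    """Count, per stream name, how many streams began data collection."""
    counts = Counter()
    for line in stdout_text.splitlines():
        match = STARTED.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
Found Liesl-Mock-EEG@rgugg-desktop matching 'type='EEG''
Found Liesl-Mock-EEG@rgugg-desktop matching 'type='EEG''
Starting the recording, press Enter to quit
Opened the stream Liesl-Mock-EEG.
Opened the stream Liesl-Mock-EEG.
Received header for stream Liesl-Mock-EEG.
Received header for stream Liesl-Mock-EEG.
Started data collection for stream Liesl-Mock-EEG.
Started data collection for stream Liesl-Mock-EEG.
"""

counts = count_started_streams(sample)
total = sum(counts.values())
expected = 2  # hypothetical: number of streams the experiment should record
if total != expected:
    print(f"WARNING: expected {expected} streams, got {total}")
else:
    print(f"All {total} streams are being recorded: {dict(counts)}")
```

In a real wrapper the text would come from the running process (for example via `subprocess.Popen` with a piped stdout), keeping in mind that the CLI waits for Enter on stdin before it stops recording.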

jfrey-xx · 2020-07-22T06:44:53Z

Thanks for the different pointers -- especially considering that this is an edge use-case!

I did not think about using the CLI; I will have a look. With the stdout of the GUI, when launched from a terminal, it is also possible to determine the number of streams being recorded, but I prefer the scripting approach :)

The types of information I am dealing with (e.g. raw and processed heart rate and breathing) are not yet listed in the wiki; I'll watch it closely in case other physiological signals get "standardized".

Concerning the names, augmenting them as in "GamepadButtons-P002" could indeed be a workaround, but in practice it would at the moment induce too much overhead in our associated pipeline in order to parse the string (we are actually building multi-user biofeedback applications in mixed reality, across different platforms and systems, so there's more than one codebase to change ^^). We'll try to think of a better approach to differentiate users, though, maybe using metadata. I discarded this approach until now because I was facing issues with reading LSL metadata in a multi-threading environment, at least in Python, and I had to resort to manual parsing of the XML to avoid crashing the script -- I should probably open a dedicated issue there 😅.
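For reference, a minimal sketch of the manual XML parsing route, using only the standard library. It assumes the stream's XML (as printed by `as_xml()` earlier in this thread) carries the user under a `<desc><subject><id>` element; that layout is an assumption made for illustration, as are the function name and the sample values.

```python
import xml.etree.ElementTree as ET


def describe_stream(info_xml):
    """Extract identifying fields from a stream info XML string.
    The 'desc/subject/id' path is an assumed layout, not a confirmed one."""
    root = ET.fromstring(info_xml.strip())
    return {
        "name": root.findtext("name"),
        "type": root.findtext("type"),
        "hostname": root.findtext("hostname"),
        "subject": root.findtext("desc/subject/id"),  # None if absent
    }


# Hypothetical stream info with the subject stored in the metadata.
sample_xml = """<?xml version="1.0"?>
<info>
    <name>GamepadButtons</name>
    <type>Markers</type>
    <hostname>device-a</hostname>
    <desc>
        <subject>
            <id>P002</id>
        </subject>
    </desc>
</info>
"""

info = describe_stream(sample_xml)
print(info)
```

Working on the XML string alone avoids touching the LSL metadata objects from several threads, at the cost of depending on the exact element layout.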

I am not sure to what extent you want to enforce the "unique stream names per hostname" policy. If you do not want to alter how LabRecorder uses Name@Host as some sort of ID, maybe in the meantime you could state it in the documentation, at either the LabRecorder or the LSL level?

PS: an option to anonymize XDF files would indeed be nice for sharing data. There is no info about the actual identity of the users in our stream names, and we'll try to keep it that way.

cboulay added a commit that referenced this issue Aug 12, 2020

Refactored how streams are tracked. Fixes #21 Fixes #29 Fixes #31

4a018cd

cboulay mentioned this issue Aug 12, 2020

Refactor stream tracking #38

Merged

cboulay closed this as completed in #38 Sep 15, 2020
