Spaces in data labels on #L and other lines #109

JPHammonds · 2017-06-29T16:59:34Z

I have run into a case where a beam line uses spaces in data labels (column names in data). This carries over into some other lines which refer to these column labels (such as #M and #@ROI). I have attached a file where scan 26 exhibits this problem. I see the problem when trying to parse the #M line. The parser does seem to split the #L line properly, although I do not see why in the code. On #L lines, there are double spaces between column labels. On the #M line there is also a double space between the monitor counts and (channel name). The same holds for #@ROI.

lineup.zip

JPHammonds · 2017-06-29T17:02:26Z

I should note that on #@ROI the double space only appears before (channel name), not after.

prjemian · 2017-06-29T17:15:46Z

Double spaces within a data column label are a syntax error. A double space is a delimiter between data columns. This will not be changed in spec2nexus.


JPHammonds · 2017-06-29T17:25:54Z

This is not a double space in the data column; it is a double space in the #L line, I think so that the spaces in column labels can be found. It is also in the #M and #@ROI lines. It seems to be in the files consistently. These are files generated by spec. It looks like for #L you use re.split("  +", strip_first_word(text)) to split the names.

JPHammonds · 2017-06-29T17:28:18Z

People should just not use spaces in labels in a text file, but spec has allowed it this long.

prjemian · 2017-06-29T17:36:09Z

Accepts that people do not follow rules. One space


JPHammonds · 2017-06-29T17:44:49Z

At least so far I have not found an instance of double spaces in a name. Just places where a single space wreaks havoc.

prjemian · 2017-06-29T19:35:29Z

For #L lines, this splits on two or more spaces: scan.L = re.split("  +", strip_first_word(text))

prjemian · 2017-06-29T19:43:35Z

SPEC data files rely on separating the data column labels with a two-space delimiter. One space is allowed within a column label; two spaces (or more) mark the start of a new column label. Despite this common practice, it was necessary to cover the special case where a scan used a single space as a delimiter.
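As a rough illustration of the delimiter rule described above, here is a minimal sketch. The strip_first_word below is a hypothetical stand-in for the helper named in this thread, and the single-space fallback is only an assumption about how the special case might be covered, not the actual spec2nexus logic.

```python
import re


def strip_first_word(line):
    """Hypothetical stand-in: drop the control key (e.g. '#L'), return the rest."""
    parts = line.strip().split(" ", 1)
    return parts[1].strip() if len(parts) > 1 else ""


def split_column_labels(line):
    """Split an #L line into labels; two or more spaces separate labels."""
    text = strip_first_word(line)
    labels = re.split(r" {2,}", text)
    # Assumed fallback for the special case: no double-space delimiter was
    # found, so treat single spaces as the delimiter instead.
    if len(labels) == 1 and " " in text:
        labels = text.split()
    return labels


# Labels that contain a single space survive the split.
print(split_column_labels("#L Two Theta  Epoch  Lock DC  Seconds"))
# A line delimited only by single spaces falls back to a plain split.
print(split_column_labels("#L tth Epoch I0 Seconds"))
```

The fallback cannot tell a one-column scan whose only label contains a space from a multi-column scan delimited by single spaces, which is one reason spaces in labels remain fragile.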

prjemian · 2017-06-29T19:47:00Z

Not sure I understand how the #M line is involved. Syntax is #M num

prjemian · 2017-06-29T19:52:11Z

Syntax for the #@ROI line is #@ROI n f l, where n is a name and f and l are integers.

This is parsed with:

        scan.M, dname = strip_first_word(text).split()
        scan.monitor_name = dname.lstrip('(').rstrip(')')

The code expects no white space in dname. Handling a name that contains a space would require a code change, which does not look difficult.
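To make the failure mode concrete, the sketch below wraps the two statements quoted above in a function and feeds it two lines. Since the snippet assigns scan.M and scan.monitor_name, the sample lines use the #M key. strip_first_word is a hypothetical stand-in for the real helper, and both sample lines are made up for illustration.

```python
def strip_first_word(line):
    """Hypothetical stand-in: drop the control key, return the rest."""
    parts = line.strip().split(" ", 1)
    return parts[1].strip() if len(parts) > 1 else ""


def parse_monitor_as_quoted(line):
    """Same two statements as the quoted snippet, wrapped in a function."""
    M, dname = strip_first_word(line).split()
    return M, dname.lstrip("(").rstrip(")")


# A name without white space unpacks into exactly two items.
print(parse_monitor_as_quoted("#M 60000 (IC3)"))

# A name with a space (hypothetical) yields three items, so unpacking fails.
try:
    parse_monitor_as_quoted("#M 60000 (my monitor)")
except ValueError as exc:
    print("fails:", exc)
```

The ValueError comes from tuple unpacking: str.split() with no argument splits on every run of white space, including the one inside the parentheses.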

Can you post an example here?

prjemian · 2017-06-29T19:53:04Z

Above, I'm referring to the SPEC syntax, as described in spec2nexus docs.

JPHammonds · 2017-06-29T20:00:16Z

That is what the manual says, but the files that I’m seeing look like:

#M 800000 (Lock DC)

I assume the thing in parentheses is the channel to monitor. Again, my experience with SPEC is limited. I am not sure how common this is. I have not seen a #M line like this in files from another beam line, although I do see some like

#M 60000 (IC3)

which have no space in the column name.

JPHammonds · 2017-06-29T20:03:01Z

I do not have a working example of this. I just saw the syntax as

#@ROI PuMb(mca1R1) 239 259

and imagined a problem if the channel had a space in it. Note that here PuMb(mca1R1) is the column name.

JPHammonds · 2017-07-11T01:24:27Z

More food for thought here. The file that I supplied for #110 (I'll include here also) shows a column name after a #T line as well. This comes from a different APS beam line than the one mentioned above. I am not sure if this is a modification of SPEC to allow defining the column that did not make it into the user manual, or if this is simply a local customization that does not seem to bother SPEC.

bnt5bt.spec.zip

prjemian · 2017-07-11T03:27:53Z

There's a pattern emerging here.

People can't read the SPEC documentation.
The parser must be more robust for lines such as:

#T 0.5  (Seconds)
#M 200000 (I0)

The string provided in parentheses is not necessary since the standard macros put the column with the counting reference (time or monitor) ALWAYS in the next to last column.

Nevertheless, the parser should be changed to strip the #T or #M away and then parse only the next number. Anything else on the line should be ignored since it has no value.
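A minimal sketch of that approach follows: strip the control key, read only the first number, and tolerate whatever comes after it. The function name and the regular expression are illustrative assumptions, not the change that was committed to spec2nexus. The trailing text is returned only as an optional convenience and can be discarded.

```python
import re

# Control key (#T or #M), then the first number; the remainder is optional.
CONTROL_RE = re.compile(
    r"^#([TM])\s+([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)\s*(.*)$"
)


def parse_count_reference(line):
    """Return (key, value, name); name is None when nothing follows the number."""
    m = CONTROL_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a #T or #M line: {line!r}")
    key, number, rest = m.groups()
    # Anything after the number is informational only; drop the parentheses.
    name = rest.strip().lstrip("(").rstrip(")") or None
    return key, float(number), name


print(parse_count_reference("#T 0.5  (Seconds)"))
print(parse_count_reference("#M 200000 (I0)"))
print(parse_count_reference("#M 800000  (Lock DC)"))
print(parse_count_reference("#T 1"))
```

Because only the first number is required, a name containing spaces no longer affects the parsed value.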

@jkirchman: This is 8-ID-E writing this data file. They need some advice.

prjemian · 2017-07-11T03:39:01Z

That same file is even more troublesome (attn @jkirchman). Later, scan 26 starts with counting time of 0 seconds:

#S 26  xpcsscan 20 1
#D Wed Jun 07 23:31:38 2017
#T 0  (Seconds)

and then reports counting time of 1 second each point

#L img_n  Epoch  pind1  pind2  pind3  pind4  I_APS  cyber  cyber_u  cyber_l  atten  T_CTL  T_SAM  Seconds  ccdc
1 21903 0 1030 850 549 509029 0 2 2 0 0 27 1 25.37771
2 21905 0 1030 850 549 509004 1 2 3 0 0 27 1 51.346129
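The mismatch can be checked mechanically. The sketch below is only an illustration built on the fragment quoted above; check_count_time and its column argument are hypothetical names, and it assumes labels are separated by two or more spaces.

```python
import re

scan_lines = [
    "#S 26  xpcsscan 20 1",
    "#D Wed Jun 07 23:31:38 2017",
    "#T 0  (Seconds)",
    "#L img_n  Epoch  pind1  pind2  pind3  pind4  I_APS  cyber  cyber_u"
    "  cyber_l  atten  T_CTL  T_SAM  Seconds  ccdc",
    "1 21903 0 1030 850 549 509029 0 2 2 0 0 27 1 25.37771",
    "2 21905 0 1030 850 549 509004 1 2 3 0 0 27 1 51.346129",
]


def check_count_time(lines, column="Seconds"):
    """Return the #T value and the per-point values that disagree with it."""
    declared = None
    labels = []
    observed = []
    for line in lines:
        if line.startswith("#T"):
            # Only the first number after the key is used.
            declared = float(line.split()[1])
        elif line.startswith("#L"):
            labels = re.split(r" {2,}", line[2:].strip())
        elif line and not line.startswith("#"):
            values = line.split()
            observed.append(float(values[labels.index(column)]))
    mismatched = [v for v in observed if v != declared]
    return declared, mismatched


declared, mismatched = check_count_time(scan_lines)
print(f"#T declares {declared} s; {len(mismatched)} data rows report otherwise")
```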

prjemian · 2017-07-11T04:15:10Z

The problem seems to occur when the data name supplied on a control line in an unexpected place (such as #T, #M, #@ROI, ...) has a space in it.

@jkirchman: Is the parenthesized data column name something new in SPEC? It is not listed in the SPEC page about the standard file content.

prjemian · 2017-07-11T04:32:47Z

@JPHammonds : the example file you provided for the #@ROI lines has these two ROIs:

FeKa(mca1R1)  MnKa(mca1R0)

Users should be cautioned against using data column names with special characters such as parentheses, braces, brackets, and lots of other characters that are often delimiters or decorators.
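For the #@ROI syntax given earlier in this thread (#@ROI n f l), one way to tolerate awkward names is to split from the right, so the two trailing integers are taken first and everything before them is the name. This is a sketch under that assumption; parse_roi is a hypothetical name, not a spec2nexus function, and the second sample name is invented.

```python
def parse_roi(line):
    """Return (name, first, last) from a '#@ROI n f l' line."""
    text = line.strip()
    if text[:5].upper() != "#@ROI":
        raise ValueError(f"not a #@ROI line: {line!r}")
    body = text[5:].strip()
    # Split from the right: the last two fields are the channel limits,
    # whatever remains on the left is the name (it may contain spaces).
    name, first, last = body.rsplit(None, 2)
    return name.strip(), int(first), int(last)


print(parse_roi("#@ROI PuMb(mca1R1) 239 259"))
# Hypothetical name containing a space still parses.
print(parse_roi("#@ROI Fe Ka(mca1R1)  239 259"))
```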

jkirchman · 2017-07-11T15:09:26Z

I checked this with numerous spec data files from several beamlines. I found files from as far back as 2010 that contained either #T or #M lines with the description in parentheses.

#T 1 (Seconds)
#M 100000 (IC6-B)

I believe this has been spec's normal behavior for quite some time now. I think the suggestion to ignore anything after the first number following either #T or #M is a good one.

JPHammonds · 2017-07-11T15:12:10Z

Pete, I am not sure how to address this with the beamlines, since the use is fairly common on some beam lines and has been for some time, I believe. More specific to the use that you show, they append (+) and (-) to column names corresponding to the different right/left spin. Perhaps a mention at the four way meeting?

prjemian · 2017-07-11T15:13:25Z

Good suggestion. Best to be absolutely confident about how the SPEC manual addresses this first.


JPHammonds · 2017-07-11T15:17:21Z

This is for an XPCS scan where nothing is moving but the data is collected at some rate, presumably from a timing pulse or a detector frequency. I am not sure what triggers the collection of data lines in the spec file in this case. As far as the analysis goes, I think time is pulled from the IMM files.

jkirchman · 2017-07-11T15:34:15Z

Checking the spec manual is advisable, but please be aware that the spec user manual at certif.com is not updated as often as it should be. I would not call the website the definitive help that you might expect. The most effective help can be found through the help command in spec or the latest release notes.

prjemian changed the title ~~Spaces in data labels on #L and other lines~~ Spaces in data labels on #L and other lines Jul 11, 2017


prjemian added a commit that referenced this issue Jul 11, 2017

#109: new tests pass with data file fragments

e0f5183

prjemian added a commit that referenced this issue Jul 11, 2017

#109 test fails with space in name

daac976

prjemian closed this as completed in ad5853b Jul 11, 2017

prjemian added this to the 2017-07 bugfix release milestone Jul 11, 2017
