Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differences in cpymad table.twiss and madx table(twiss, X, X) when running use #93

Closed
JoschD opened this issue Jul 16, 2021 · 7 comments · Fixed by #94
Closed

Differences in cpymad table.twiss and madx table(twiss, X, X) when running use #93

JoschD opened this issue Jul 16, 2021 · 7 comments · Fixed by #94

Comments

@JoschD
Copy link

JoschD commented Jul 16, 2021

I see differences when getting values from table.twiss.dframe().loc[X, X] vs running table(twiss, X, X) in madx (e.g. via input()) if there is a use in between the last twiss and getting the values.

These differences can be quite substantial, but I have not managed to create a minimum working example with large differences.
Here is one that shows the problem, but the differences are not as big as I can see in some of my scripts:

#%%
# ------- Setup --------
from cpymad.madx import Madx
madx =  Madx()
mvars = madx.globals  # shorthand

madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/toolkit/macro.madx")
madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/lhc_as-built.seq")
madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/PROTON/opticsfile.22_ctpps2")
mvars.on_sep1 = 0.  # creates more obvious differece

madx.beam(
    sequence='lhcb1', bv=1,
    energy="NRJ", particle="proton", npart=1e10,
    kbunch=0, ex=7.29767146889e-09, ey=7.29767146889e-09
)
madx.use(sequence='lhcb1')

#%%
# ------ Actual Test ------
madx.twiss(sequence="lhcb1")
madx.use(sequence="lhcb1")  # This seems to be the culprit
# madx.twiss(sequence="lhcb1")  # Also fixes the problem

twiss_df = madx.table.twiss.dframe()
mvars["from_dframe"] = twiss_df.loc["ip1", "x"]
madx.input("from_table = table(twiss, ip1, x);")

print(mvars["from_dframe"])
print(mvars["from_table"])

Output:

-9.04026208843632e-11
-3.822658682068302e-11

upon commenting out the madx.use() or doing another madx.twiss() afterwards :

-3.822658682068302e-11
-3.822658682068302e-11
@ydutheil
Copy link

ydutheil commented Jul 16, 2021

Hi,

I remember facing something similar and since then I never use the madx.table.twiss because I am unsure what remains there. I think the point is that USE creates the sequence anew, and seems for some reason to change the row_names of the table.

What do you think using twiss_df = madx.sequence['lhcb1'].twiss_table.dframe() instead, this is what I usually do and in your example it raises an exception on the twiss table being invalid.

below a not so minimal example

from cpymad.madx import Madx


mqk, mql, length = 1.0, 0.1, 5.3

madx = Madx()
madx.beam(particle='proton', pc=1000)

madx.command.sequence.clone('S1', l='{:10.6e}'.format(length))
madx.elements.marker.clone('s1start', at=0)
madx.command.endsequence()

madx.command.quadrupole.clone('QF', l=mql, k1=mqk)
madx.command.quadrupole.clone('QD', l=mql, k1=-mqk)
madx.command.rbend.clone('dip', l=0.1, angle=10e-3)


madx.command.seqedit(sequence='S1')
madx.command.install(element='QF1', class_='QF', at='{:10.6e}'.format(mql/2))
madx.command.install(element='B1', class_='dip', at=mql+0.1/2)
madx.command.install(element='QD1', class_='QD', at='{:10.6e}'.format(length/2-mql/2))
madx.command.endedit()



madx.use(sequence='s1')
madx.twiss()

print(madx.table.twiss.row_names())
print(madx.table.twiss.name)
print(madx.sequence['s1'].twiss_table.row_names())

madx.options(debug=True)
madx.use(sequence='s1')
print(madx.table.twiss.row_names())
print(madx.table.twiss.name)
print(madx.sequence['s1'].twiss_table.row_names())

@JoschD
Copy link
Author

JoschD commented Jul 16, 2021

Hey, thank you for your answer.
That looks like a safer way of doing things indeed!

But, workarounds aside, this still looks like a bug as at the same state of the madx-instance
table(twiss, X, X) works while none of the other two options (table.twiss or sequence['lhcb1'].twiss_table) result in correct values (at least the second one let's you know about it, though...)

@coldfix
Copy link
Member

coldfix commented Jul 16, 2021

Hi,

interesting. I've reproduced using a smaller example (since I don't have or want to install afs here).

from cpymad.madx import Madx

madx = Madx()
madx.input("""
beam;

mqf.k1 =  0.3037241107;
mqd.k1 = -0.3037241107;

fodo: sequence, l=10, refer=entry;
mqf: quadrupole, at=0, l=1, k1:=mqf.k1;
dff: drift,      at=1, l=4;
mqd: quadrupole, at=5, l=1, k1:=mqd.k1;
dfd: drift,      at=6, l=4;
endsequence;

use, sequence=fodo;
twiss, sequence=fodo, x=0.1;
""")

before = madx.table.twiss.dframe()
madx.use(sequence="fodo")
after = madx.table.twiss.dframe()

print("row names before:", before.index.tolist())
print("row names after: ", after.index.tolist())

print("name column before:", before.name.tolist())
print("name column after: ", after.name.tolist())

print("x column before:", before.x.tolist())
print("x column after: ", after.x.tolist())

madx.input("from_table = table(twiss, mqf, x);")
print(madx.globals['from_table'])

From this we can see that the table values are still correct, just the row names get messed up by the use call. There is also a comment in the MAD-X use_sequ function in src/mad_seq.c: calling use screws up any twiss table calculated..

I believe the reason for that is that the table row names are assigned as raw pointers to the current node's name when adding a row, see src/mad_table.c:

void
augment_count(const char* table) /* increase table occ. by 1, fill missing */
{
  ...
  if (t->node_nm != NULL)
  {
    t->node_nm->p[t->curr] = current_node->name;
    t->node_nm->curr = t->curr;
  }
  ...
}

Now, after calling use, the sequence's nodes may be invalidated, and hence these pointers are not valid anymore. We could maybe even get segmentation faults from trying to read these values. Hence, using node_nm is not safe after use.

The reason that MAD-X's table(twiss, mqf, x) still works is that it doesn't use node_nm to look up the row, but seems to try all string columns instead, see function table_row(table, name) in src/mad_table.c.

The approach MAD-X is taking (i.e. trying to find the element name in arbitrary string columns) is not a good idea IMO, and not even applicable here (since we have to decide on one column to be used for the dataframe's index).

Hence, this is probably something to be added to "known issues" that it is unsafe to use tables after use.

A possible workaround would be:

twiss_table = madx.table.twiss
twiss_df = pd.DataFrame(
    twiss_table.copy(),
    index=np.char.partition(twiss_table.name, ":")[:, 0])

We could integrate using this workaround by adding an index argument to the Table.dframe() method that allows specifying a column name to be used as index instead of node_nm. What do you think?

@JoschD
Copy link
Author

JoschD commented Jul 17, 2021

Thank you very much for looking that deep into it (and for providing your take on the smallest example), @coldfix . I like your workaround idea, if the data is still in place? It didn't sound too promising, if the nodes are rebuild. Can we trust any data in the table.twiss after use?
I also forwarded this issue to @tpersson , in case he sees some obvious problem from the mad-x side with this.

@tpersson
Copy link
Collaborator

Hi,
I think this part of connecting the table to the nodes is used for the plotting and would be rather difficult to change since still, quite a few people are using the plotting. I think @coldfix solution works fine and I would recommend going for that in cpymad.

@coldfix
Copy link
Member

coldfix commented Jul 22, 2021

Can we trust any data in the table.twiss after use?

As far as I see, the data in the tables are not touched by USE, it's just that the sequence nodes get recreated which is a problem only for the node_nm pointer that is used by cpymad for the index.

coldfix added a commit that referenced this issue Aug 12, 2021
coldfix added a commit that referenced this issue Aug 12, 2021
coldfix added a commit that referenced this issue Aug 12, 2021
coldfix added a commit that referenced this issue Aug 12, 2021
coldfix added a commit that referenced this issue Aug 12, 2021
@JoschD
Copy link
Author

JoschD commented Aug 13, 2021

Thank you so much!
This is an amazing package and I am very thankful that you are still maintaining it!

coldfix added a commit that referenced this issue Sep 1, 2021
API changes:

- ``Table.selected_rows()`` now actually returns the *indices* of the selected
  elements as documented, rather than returning a boolean mask.
- ``Madx.eval()`` does no automatic syntax checking anymore. This is a minor
  performance improvement and is more consistent with ``Madx.input()`` which
  doesn't check the syntax either. Expressions can still be checked manually
  using ``cpymad.util.check_expression``

New features:

- (#90) Add comparison operators for ArrayAttribute (see #89)
- (#94) Add keyword argument ``Table.dframe(index=..)`` to allow specifying
  a column or sequence as the DataFrame index rather than using the default
  (``row_names()``). This is essential when accessing a table after having
  executed a ``USE`` statement (see #93).
- (#97) Add basic support for unexpanded nested sequences by returning
  them as elements of type ``Sequence`` from ``Sequence.elements`` (see #76)
- Add keyword argument ``Madx(prompt=...)`` as a shortcut for the most common
  ``CommandLog`` use case
- (#99) Add method ``Table.column()`` to retrieve specified rows in a specific
  column.
- (#99) Add ``rows`` and/or ``columns`` arguments to several ``Table`` methods
  to allow querying only specific columns or rows from the MAD-X process
- (#99) Add method ``Table.selection()`` that returns a new ``Table`` object
  which exposes only those rows/columns marked by a previous ``SELECT`` statement
  (see #98)
- Fix ``Madx.eval()`` to handle function calls, e.g.: ``sin(...)`` or ``table(...)``

Bug fixes:

- (#95) Fix ``KeyError`` when accessing tables after ``CLEAR`` (see #57)
- (#99) Fix requesting a subset of table rows using using a *numpy array*
  of indices
- Fix expression syntax checker to not reject function calls anymore. Rewrite
  the checker to use a LL(1) parser generator.

Documentation:

- Add links to external documentation using intersphinx
- Use automodapi to create module and class summary pages
- Use type hints from function annotations in documentation
- Mark more objects for inclusion in the documentation
- Add "Edit on GitHub" link to directly edit documentation
- Add many function annotations
- Generate class inheritance diagrams using graphviz
- Document problem with ``Table.row_names()`` after ``USE`` (#93)

Tests/CI:

- Move python 3.5 deprecation warning to package level
- Create GitHub releases for tags automatically (required for zenodo DOIs)
- Setup sequence definitions individually and explicitly in each test
- Add module for regression tests for all future bugfixes
- Put transfer map tests in their own module
- Port tests from unittest to pytest for simplicity
- Mark flaky tests for expected failure on macOS and windows
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants