Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test core_def_all_beads added. [Issue #27] #31

Merged
merged 11 commits into from
Jun 25, 2021
Merged

test core_def_all_beads added. [Issue #27] #31

merged 11 commits into from
Jun 25, 2021

Conversation

ojeda-e
Copy link
Member

@ojeda-e ojeda-e commented Jun 19, 2021

Fixes issue #27

Changes:

  • test_def_all_beads added.
  • added import def_all_beads in core.py

This PR helps to solve #17

@ojeda-e ojeda-e changed the title test core_def_all_bedas added. test core_def_all_bedas added. [Issue #27] Jun 19, 2021
@codecov
Copy link

codecov bot commented Jun 19, 2021

Codecov Report

Merging #31 (59bc610) into main (bf6019b) will increase coverage by 13.65%.
The diff coverage is 100.00%.

@ojeda-e ojeda-e requested a review from lilyminium June 19, 2021 00:54
@ojeda-e ojeda-e linked an issue Jun 19, 2021 that may be closed by this pull request
Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ojeda-e for adding this test. I assume it's a regression test to test former behaviour. Could you please add a smaller example that's more easily human-verifiable (e.g. 10 lipids in each leaflet)? Looking at this test I have no idea what def_all_beads is for, nor how the head_list is meant to work -- it looks as though you've only told the function that 3 lipid indices belong to different leaflets, and yet I see that there are many beads in the MEMBRANE_CURVATURE_DATA reference. What happens if your beads are all mixed up, like lipid 1 and 3 are in the upper leaflet and lipid 2 is in the lower one? If your code doesn't currently allow for that, it's fine for the test to fail (mark it with pytest.mark.xfail and give a reason).

@@ -18,6 +18,7 @@

sys.path.append('lib/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you see if you can remove this line? It shouldn't be necessary; if something breaks, you may need to from .lib import my_func before calling my_func.

@@ -5,13 +5,15 @@
# Import package, test suite, and other packages as needed
import pickle

from numpy.testing._private.utils import assert_array_equal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general you should avoid using private paths (typically prefixed with an underscore), as they may change in the future. Can you `from numpy.testing import assert_array_equal)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought I added that change already in a previous PR. That change wasn't saved.

topology = md.load(GRO_MEMBRANE_PROTEIN).topology
beads_test = def_all_beads(lipid_types, lfs, head_list, topology)
for lt in lipid_types:
for bead, bead_t in zip(MEMBRANE_CURVATURE_DATA['beads']["upper"][lt], beads_test["upper"][lt]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the lower beads?

Copy link
Member Author

@ojeda-e ojeda-e Jun 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a smaller example that's more easily human-verifiable (e.g. 10 lipids in each leaflet)?

Should I add a new test (smaller number of beads), or is it better to replace the current one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new test in addition to this would be the best of both worlds ^_^

beads_test = def_all_beads(lipid_types, lfs, head_list, topology)
for lt in lipid_types:
for bead, bead_t in zip(MEMBRANE_CURVATURE_DATA['beads']["upper"][lt], beads_test["upper"][lt]):
assert_almost_equal(int(bead), int(bead_t))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is an integer, this should be fully equal instead of almost equal (which is usually used for floats). You should also know that both values are already integers, right? Could you assert bead == bead_t?

@ojeda-e
Copy link
Member Author

ojeda-e commented Jun 20, 2021

Thanks for the review @lilyminium. Here some answers (and a question) :)

it looks as though you've only told the function that 3 lipid indices belong to different leaflets, and yet I see that there are many beads in the MEMBRANE_CURVATURE_DATA reference.

To clarify, the indexes for leaflets were initially provided by the user with parse_args, -ii, and -io. For the test, the associated grofile has the indexes
outer leaflet: range(4286, 11531)
inner leaflet: range(11531, 18891)

Edit: so the def_all_beads function works using only three indexes, First index of upper leaflet, last index of upper leaflet, and last index of upper leaflet.
It would fail if the indexes of lower and upper leaflet are not consecutive. (i.e. the last index of upper leaflet is not inmediately before the first bead of the lower leaflet.) I haven't seen a grofile that splits indexes of beads like that, but who knows.

What happens if your beads are all mixed up, like lipid 1 and 3 are in the upper leaflet and lipid 2 is in the lower one? If your code doesn't currently allow for that, it's fine for the test to fail (mark it with pytest.mark.xfail and give a reason).

It will probably assign the wrong indexes to each leaflet. It'll fail for sure. But now a question: in the structure of a grofile, the lipids are always "organized", aren't they? I mean. The way indexes are assigned to elements (atoms or beads) in a grofile is always strictly increasing.
If that's true, even in cases where flip-flop occurs (trajectory), the grofile won't mix up the indexes per leaflet. Please correct me if I am wrong.

@ojeda-e ojeda-e changed the title test core_def_all_bedas added. [Issue #27] test core_def_all_beads added. [Issue #27] Jun 20, 2021
@lilyminium
Copy link
Member

lilyminium commented Jun 20, 2021

It will probably assign the wrong indexes to each leaflet. It'll fail for sure. But now a question: in the structure of a grofile, the lipids are always "organized", aren't they? I mean. The way indexes are assigned to elements (atoms or beads) in a grofile is always strictly increasing.

It's increasing, but only in terms of "this is the 58th lipid, this is the 59th lipid". It's pretty arbitrary where the lipid is in 3D space and does not correspond with the order in the gro file. The reason most membranes are divided by leaflet in the grofile is because that's how most humans think, and mostly humans put the grofile together (edit: or write the scripts that do so).

If that's true, even in cases where flip-flop occurs (trajectory), the grofile won't mix up the indexes per leaflet. Please correct me if I am wrong.

The order of the grofile also remains static; only the positions change. That means that even if cholesterol flips to the other leaflet, the index won't change. So you could have lower-leaflet cholesterol mixed in with the other upper leaflet lipids.

@ojeda-e
Copy link
Member Author

ojeda-e commented Jun 20, 2021

The order of the grofile also remains static; only the positions change. That means that even if cholesterol flips to the other leaflet, the index won't change. So you could have lower-leaflet cholesterol mixed in with the other upper leaflet lipids.

Sorry I am not communicating correctly. The code reads only once the grofile, before iterating any frame. def_all_beads is not called in every frame, otherwise, it would be very expensive to run and would be affected by the 3D instant coordinates it adopts for every dt. The grofile is static, and for the same reason by reading it once, before iterating over any frame, indexes are ordered. Hence, if flips occur during the simulation, that won't change the indexes assign per leaflet.

@pep8speaks
Copy link

pep8speaks commented Jun 20, 2021

Hello @ojeda-e! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 11:120: E501 line too long (126 > 119 characters)

Comment last updated at 2021-06-24 05:23:09 UTC

16800, 16812, 16824, 16836, 16848, 16860, 16872, 16884, 16896, 16908, 16920,
16932, 16944, 16956, 16968, 16980, 16992, 17004, 17016, 17028, 17040]}},

'small': {'lower': {'POPC': [5, 6, 7], 'POPE': [8, 9]}, 'upper': {'POPC': [0, 1, 4], 'POPE': [2, 3]}},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upper POPC molecules of [0, 1, 4] have the coordinates:

1923POPC PO4 1 1.361 0.630 16.101
2054POPC PO4 2 0.704 2.229 15.981
2622POPC PO4 5 1.451 0.382 11.752

That last POPC looks much more similar to the lower POPC with the coordinates:

2702POPC PO4 6 0.081 0.610 11.835
2708POPC PO4 7 0.678 2.233 11.825
2713POPC PO4 8 0.988 0.554 11.838

Are you sure this is the right answer?

Comment on lines 255 to 258
'med': {'lower': {'POPC': [12, 13, 14, 15, 16],
'POPE': [17, 18, 19, 20, 21, 22, 23, 24]},
'upper': {'POPC': [0, 1, 2, 3, 4, 5, 6, 7, 11],
'POPE': [8, 9, 10]}},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The z-positions of the upper POPC are: [16.101, 16.251, 16.095, 15.981, 16.518, 16.049, 15.95 , 16.268, 11.752]. That last POPC also seems quite low.

I haven't checked this for the big system but could you please check that the results you're getting are correct?

}


@pytest.fixture()
@ pytest.fixture()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's normally conventional to put a space after the @?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I just noticed it's the pep8 plugin (in VS) adds those spaces. Deleted in the last commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might think (wrongly) that @ is the binary matrix multiplication operator??

In MDAnalysis we typically look how we wrote existing code and then try to match the existing code, unless it appears to go against our style guide

beads_test = def_all_beads(lipid_types, lfs, head_list, topology)
for lt, lf in zip(lipid_types, lfs):
for bead, bead_t in zip(MEMBRANE_CURVATURE_DATA['membrane_protein'][lf][lt], beads_test[lf][lt]):
assert int(bead) == int(bead_t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need these int() calls? Aren't they already integers?

@lilyminium
Copy link
Member

In addition, instead of using selections from the GRO file, I was hoping you could drum up some fake coordinates as outlined in #30 (review) so that you could use the same system for checking the x-y gridding as well.

@lilyminium
Copy link
Member

I think my comment on this got lost in the review -- thanks for adding the smaller files, they made it easier to go through and verify that we are or aren't getting the results we want :)

@ojeda-e
Copy link
Member Author

ojeda-e commented Jun 24, 2021

Thanks for your review, @lilyminium
I think at this point is better to reduce the test to only one small system, which is here included as test_po4_small.gro and its respective trajectory file test_po4_small.xtc which includes 2 frames. The simulation box has dimensions of x_dim, y_dim = 3, 3. Therefore, the grid will be considered for this test as a 3x3 grid.
The medium and big systems were temporarily removed.

Given the functions I should test, both gro and xtc files are needed. Fixing issue #32 will allow testing functions with dummy coordinates instead of grofiles.

In what respect to this PR, the small gro file provided includes 9 beads, all of them POPC lipid type, to simplify. Also to simplify, all beads have the same z value: z=15.000 in the upper leaflet, and z=12.000 in the lower leaflet. Since I am interested in checking z_avg and the mapping, realistic values are not relevant at this point.

There are 4 beads in the upper leaflet, beads [(0) to (3)], and 5 beads in the lower leaflet, beads [(4) to (8)]. Or as provided, as in the pytest.parametrized, it can be read as: ref_beads = {'small': {'upper': {'POPC': [0, 1, 2, 3]}, 'lower': {'POPC': [4, 5, 6, 7, 8]}}}

The 9 lipids are distributed as shown in the figure below

Frame 1

^ o ___ o ___ o __ |
| |     |     |    |
| (6)__(7)___(8)___|
y |     |     |    |
| (3)__(4)___(5)___|
| |     |     |    |
v (0)__(1)___(2)___|
  0.    1.    2.   3.
  <------ x ------->

Frame 2

^  o ___ o ___ o ___ |
|  | (6) | (7) | (8) |
|  o ___ o ___ o ___ |
y  | (3) | (4) | (5) |
|  o ___ o ___ o ___ |
|  | (0) | (1) | (2) |
v  o ___ o ___ o ___ |
  0.    1.    2.   3.
   <------- x ------->

Each bead moves 0.5 in both (x,y) directions, between frame 1 and 2.
Each bead is mapped to its respective unit cell, according to its (x,y) coordinates.

==== In upper leaflet:
Frame No. 0
Bead 0 mapped to: [ 0 0 ] with coords: 0.0 0.0 15.0
Bead 1 mapped to: [ 1 0 ] with coords: 1.0 0.0 15.0
Bead 2 mapped to: [ 2 0 ] with coords: 2.0 0.0 15.0
Bead 3 mapped to: [ 0 1 ] with coords: 0.0 1.0 15.0
Frame No. 1
Bead 0 mapped to: [ 0 0 ] with coords: 0.5 0.5 15.0
Bead 1 mapped to: [ 1 0 ] with coords: 1.5 0.5 15.0
Bead 2 mapped to: [ 2 0 ] with coords: 2.5 0.5 15.0
Bead 3 mapped to: [ 0 1 ] with coords: 0.5 1.5 15.0

==== In lower leaflet:
Frame No. 0
Bead 4 mapped to: [ 1 1 ] with coords: 1.0 1.0 12.0
Bead 5 mapped to: [ 2 1 ] with coords: 2.0 1.0 12.0
Bead 6 mapped to: [ 0 2 ] with coords: 0.0 2.0 12.0
Bead 7 mapped to: [ 1 2 ] with coords: 1.0 2.0 12.0
Bead 8 mapped to: [ 2 2 ] with coords: 2.0 2.0 12.0
Frame No. 1
Bead 4 mapped to: [ 1 1 ] with coords: 1.5 1.5 12.0
Bead 5 mapped to: [ 2 1 ] with coords: 2.5 1.5 12.0
Bead 6 mapped to: [ 0 2 ] with coords: 0.5 2.5 12.0
Bead 7 mapped to: [ 1 2 ] with coords: 1.5 2.5 12.0
Bead 8 mapped to: [ 2 2 ] with coords: 2.5 2.5 12.0

As a result, the first grid will be populated

Frame 1

  • Upper Leaflet
o ______ o _____ o _______ |
| np.nan | np.nan | np.nan |
o ______ o _____ o _______ |
|   (3)  | np.nan | np.nan |
o _______o ______ o ______ |
|   (0)  |   (1)  |   (2)  |
o ______ o ______ o ______ |

-Lower Leaflet

o ______ o _____ o _______ |
|   (6)  |  (7)   |   (8)  |
o ______ o _____ o _______ |
| np.nan |  (5)   |   (6)  |
o _______o ______ o ______ |
| np.nan | np.nan | np.nan |
o ______ o ______ o ______ |

The diagrams above are equivalent to

z = {'upper': array([[15., 15., nan], [15., nan, nan], [15., nan, nan]]), 
     'lower': array([[nan, nan, 12.], [nan, 12., 12.], [nan, 12., 12.]])}

With these values from the small system, I added tests for the functions core_fast, core_fast_leaflet, and test_all_beads in mods.py.
Although I could have parametrized fixtures, I decided not to because this test will dramatically change once MDtraj is replaced by MDAnalysis, which is the next step.
I hope these tests are simpler to follow, then I can focus on refactoring.
Thanks.

@ojeda-e ojeda-e requested a review from lilyminium June 24, 2021 05:30
@ojeda-e
Copy link
Member Author

ojeda-e commented Jun 24, 2021

@IAlibay, maybe before creating universes from arrays and write and re-load, could you please take a look here?

Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes @ojeda-e! I have some small comments but I'll approve because they are quite small.

'resname ' + lt + ' and index ' + str(head_list[1] + 1) + ' to ' + str(head_list[2]) + ' and name GM1'))).astype(int).tolist()

dic_all_beads['upper'][lt] = topology.select(
'resname ' + lt + ' and index ' + str(head_list[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know about f-strings, @ojeda-e? It might be more readable to do this?

f"resname {lt} and index {head_list[0]} to {head_list[1]} and name PO4"

As a side note, maybe you shouldn't hard-code the name of the bead / atom in in the refactor :)

Comment on lines +271 to +272
def leaflets():
return ["lower", "upper"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't wrong, but you could also just have the line below somewhere high up in the file, or pass in the leaflet labels with pytest.mark.parametrize.

LEAFLETS = ["lower", "upper"]

Fixtures are initialized and destroyed for each call of a test function (depending on the scope of the fixture) in a clean way to make sure that the input is always the same. That's not really necessary for a list of strings.

@pytest.fixture()
def topol():
top = {'small': md.load(GRO_PO4_SMALL).topology,
'all': md.load(GRO_PO4).topology}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This on the other hand is a good use of a fixture. If one test modifies the GRO_PO4 MDTraj object, we don't want that to affect another test.

@ojeda-e
Copy link
Member Author

ojeda-e commented Jun 25, 2021

Thanks for the review, @lilyminium .
I agree with your comments. I will merge this as it is and will adapt them in the refactoring PR #34.
Thanks.

@ojeda-e ojeda-e merged commit e60ce22 into main Jun 25, 2021
@ojeda-e ojeda-e deleted the issue27 branch August 29, 2021 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write test def_all_beads
4 participants