
Non-existing event introduced when converting pmaps DataFrames to dictionaries #149

Closed
gonzaponte opened this issue Mar 3, 2017 · 30 comments


@gonzaponte (Collaborator)

The function cdf_to_dict in pmaps_functions_c.pyx always returns event #0 even when it is not present in the input DataFrame.
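For context, the intended contract can be sketched in pure Python (a sketch only: the names and the flat-column input are hypothetical stand-ins for the real DataFrame-based cdf_to_dict):

```python
import numpy as np

def df_to_dict_reference(event, peak, time, ene):
    # Group parallel columns into {event: {peak: (times, energies)}}.
    # Only event numbers actually present in the input may appear as
    # keys -- the property that the buggy cdf_to_dict violates.
    out = {}
    for ev, pk, t, e in zip(event, peak, time, ene):
        ts, es = out.setdefault(ev, {}).setdefault(pk, ([], []))
        ts.append(t)
        es.append(e)
    return {ev: {pk: (np.array(ts), np.array(es))
                 for pk, (ts, es) in pks.items()}
            for ev, pks in out.items()}

# An input that deliberately contains no event #0:
result = df_to_dict_reference([3, 3, 5], [0, 0, 1],
                              [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
assert 0 not in result          # the buggy version would inject key 0
assert set(result) == {3, 5}
```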

@gonzaponte (Collaborator, Author)

I have in my fork (gonzaponte/issue149) a demonstration of the failure. I had to modify the test file so it does not include event #0. Although I have written a new test to demonstrate the bug, all tests fail because of the bug. However, since the new test is more exhaustive, I will rewrite the previous ones to perform as the new one.

@jacg (Collaborator) commented Mar 6, 2017

I had to modify the test file so it does not include event #0.

Would it be possible/better to do this by adding another test file? How big a new data file would you need to be able to write this test?

Please do a whitespace cleanup. pmaps_functions_c.pyx, for example, contains a number of spurious whitespace additions.

You have removed the initializations for evt and pk in cdf_to_dict. Is there any good reason for this? A quick search into uninitialized variables in Cython does not inspire confidence. Please initialize these variables, unless there's some good reason not to (which I find hard to imagine).

Please squash into two commits: one adding the tests in their final form, leaving the implementation alone; one implementing the solution, leaving the tests alone.

@jjgomezcadenas Please have a quick look at this, and comment on how this fits into the grand scheme of things. Also please comment on the desirability of adding a new test data file vs. modifying the existing one.

@gonzaponte (Collaborator, Author)

Would it be possible/better to do this by adding another test file? How big a new data file would you need to be able to write this test?

It is possible, but maybe redundant. It would not be very large (a few tens or hundreds of kB).

Please do a whitespace cleanup. pmaps_functions_c.pyx, for example, contains a number of spurious whitespace additions.

Ok.

You have removed the initializations for evt and pk in cdf_to_dict. Is there any good reason for this? A quick search into uninitialized variables in Cython does not inspire confidence. Please initialize these variables, unless there's some good reason not to (which I find hard to imagine).

No good reason; it is simply not necessary, as they are always assigned before use. Maybe initializing them to 0 can be misleading, but it is not a big deal. I will set them to 0.

Please squash into two commits: one adding the tests in their final form, leaving the implementation alone; one implementing the solution, leaving the tests alone.

Ok.

@jacg (Collaborator) commented Mar 6, 2017

Maybe initializing them to 0 can be misleading, but it is not a big deal. I will set them to 0.

No. My bad. I looked at the code too hastily. You are initializing them before first use, so it's fine as it is. As you say, it might even be better this way.

jacg mentioned this issue Mar 7, 2017
gonzaponte mentioned this issue Mar 7, 2017
jacg added a commit to jacg/IC that referenced this issue Mar 8, 2017
It turns out that, beautiful though it was, the `xxx` implementation
based on Pandas `groupby` was two orders of magnitude slower than
`s12df_to_s12l`.

It looks like Pandas `groupby` is fundamentally slow. There may be a
way of achieving the desired effect with `numpy.unique` or something
similar, but, for now, `xxx` has been implemented in Cython,
forgetting any attempt to make the result contain small sections of
the original dataframe.

`xxx` now passes exactly the same tests as `s12df_to_s12l` and the
tests used to drive the development of `xxx` are now also being
applied to `s12df_to_s12l`: all these tests are parametrized with both
implementations.

As far as I can see so far, `xxx` has two advantages over
`s12df_to_s12l` (and its Cython core `cdf_to_dict`):

+ it is legible

+ it is 15% faster

The next step is to see how `xxx` stacks up against next-exp#149, and how it
compares to the proposed solution.

In the long term we want to look at whether we want the dictionaries
which are produced to contain namedtuples or dataframes, rather than
the current tuple of numpy arrays representing the times and energies
in each peak.
@jacg (Collaborator) commented Mar 8, 2017

This was bugging me, and it became clear that I wasn't going to get to sleep until I had fixed it ...

It turns out that, beautiful though it was, the xxx implementation (discussed in #158) based on Pandas groupby was two orders of magnitude slower than s12df_to_s12l.

It looks like Pandas groupby is fundamentally slow. There may be a way of achieving the desired effect with numpy.unique or something similar, but, for now, xxx has been reimplemented in Cython,
forgetting any attempt to make the result contain small sections of the original dataframe rather than a pair of numpy arrays stuffed into a tuple.
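For the record, a numpy.unique-based grouping could look roughly like this (a hedged sketch with made-up names, assuming the event column is sorted, as the pmaps tables appear to be; not actual IC code):

```python
import numpy as np

def group_by_event(event, time, ene):
    # Split parallel arrays into {event: (times, energies)} without
    # pandas groupby. Assumes `event` is sorted (non-decreasing).
    event = np.asarray(event)
    # Index of the first occurrence of each distinct event number:
    uniq, starts = np.unique(event, return_index=True)
    # Split the payload arrays at those boundaries (drop the leading 0):
    times = np.split(np.asarray(time), starts[1:])
    enes  = np.split(np.asarray(ene),  starts[1:])
    return {ev: (t, e) for ev, t, e in zip(uniq, times, enes)}

groups = group_by_event([3, 3, 5], [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```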

The new xxx now passes exactly the same tests as s12df_to_s12l and the tests used to drive the development of xxx are now also being applied to s12df_to_s12l: all these tests are parametrized with both implementations.

As far as I can see so far, xxx has two advantages over s12df_to_s12l (and its Cython core cdf_to_dict):

  • it is legible

  • it is 15% faster

The next step is to see how xxx stacks up against #149, and how it compares to the proposed solution.

In the long term we want to look at whether we want the dictionaries which are produced to contain namedtuples or dataframes, rather than the current tuple of numpy arrays representing the times and energies in each peak.

@jacg (Collaborator) commented Mar 8, 2017

I have merged (yes! merged, don't worry, it's only temporary: it serves the purpose of keeping track of the alternatives until we decide which one we want) the implementation of xxx and the tests (which now run both on xxx and s12df_to_s12l) into the issue149 branch, and pushed it to my fork.

It seems that xxx passes the issue-149 tests out of the box. @gonzaponte please confirm that I haven't missed something, and that I am actually running the test that points out the bug, and not doing anything else stupid.

  • The proposed bugfix seems to slow down s12df_to_s12l a bit, so the performance advantage of xxx increases from 15% to about 20%.

  • The proposed bugfix shortens and clears up s12df_to_s12l quite a bit, but it still seems rather more cryptic than xxx; then again, as the author of the latter I would say that, wouldn't I!

Are there any other problems or bugs pending with s12df_to_s12l?

As this is a function which is pivotal and has been problematic, we should make an effort to test it more thoroughly.

@neuslopez Could you please confirm that my timings make sense: I merely ran timeit on the s1 dataframe that comes out of database/test_data/KrMC_pmaps.h5.

Good night.

@gonzaponte (Collaborator, Author)

As far as I can see so far, xxx has two advantages over s12df_to_s12l (and its Cython core cdf_to_dict):

  • it is legible

  • it is 15% faster

Well, I don't find it much more readable. In s12df_to_s12l the workflow is clearer, imho. But of course, this is a matter of interpretation. Concerning performance, s12df_to_s12l contains a function call overhead because it is a python function calling the cython one. It can be written in the same fashion as yours (pure cython function). I guess this structure comes because we want a pure python interface, @jjgomezcadenas?
In any case, I don't think performance is critical for this function, which will only be executed once per file.

It seems that xxx passes the issue-149 tests out of the box. @gonzaponte please confirm that I haven't missed something, and that I am actually running the test that points out the bug, and not doing anything else stupid.

I think it is ok!

Are there any other problems or bugs pending with s12df_to_s12l?

I don't think there are more bugs, but we can extend testing to include those that came up in issue #146.

@neuslopez (Collaborator)

Using branch issue149, see below the new timings using the new xxx (same Kr file):

  1. pf.s12df_to_s12l(S1df,10000): 10 loops, best of 3: 58.2 ms per loop
  2. cpm.xxx (S1df,10000): 10 loops, best of 3: 41.3 ms per loop

@jacg (Collaborator) commented Mar 8, 2017

Concerning performance, s12df_to_s12l contains a function call overhead because it is a python function calling the cython one.

It does, but this experiment

python -m timeit -s 'f = lambda:1' 'f()'
10000000 loops, best of 3: 0.0858 usec per loop

python -m timeit -s 'f = lambda:1' 'f'  
100000000 loops, best of 3: 0.0142 usec per loop

shows that a pure-Python function call can be performed in 86 nanoseconds, while the lookup of the function's name takes 14, leaving no more than about 70 nanoseconds for the overhead of calling a pure-Python function. This tiny O(1) overhead cannot explain the (O(n)-ish) differences of 30 microseconds that I have been observing for small files, and the 17 milliseconds that @neuslopez reports above for larger files.

It can be written in the same fashion as yours (pure cython function). I guess this structure comes because we want a pure python interface

In this case I cannot see any advantages offered by the pure python interface wrapper. It just needlessly complicates and obfuscates the issue. I'm always happy to be educated, though.

In any case, the current version of cdf_to_dict is much better (even if it is a little slower) than the old one. The old version was written in write-only style, and I admire your courage and determination in having fixed the bug in it. I was not up to the task, without rewriting it from scratch.

Tasks remaining:

  • Decide whether we prefer cdf_to_dict + s12df_to_s12l or xxx.
  • If we choose the former, combine it into a single Cython function.
  • Clean up and expand the tests, including any that emerge from #146 (Wrong type in last evt in s12df_to_s12l).

Once that is done we could merge. Or we could discuss whether we really want dictionaries containing dictionaries, containing tuples, containing arrays; or whether we would like to replace the arrays-in-tuples with dataframes. But I think that the latter would be better discussed in a separate issue/PR. Some observations though:

  • After my experience with writing xxx, I am less keen on the dataframes: indexing them is very slow! But perhaps this doesn't matter in our usage?
  • This issue probably needs to be discussed with reference to #157 (Modifying the format of S2Si table).

Oh, and for cross-referencing purposes, this issue is addressed in PR #162.

(GitHub used to have the capability to turn issues into PRs. They got rid of it. Shame, as it seems very useful.)

@gonzaponte (Collaborator, Author)

In this case I cannot see any advantages offered by the pure python interface wrapper. It just needlessly complicates and obfuscates the issue

The only "advantage" I see is that gathering all functions in a single file (pmaps_functions.py) allows you to forget the different implementations (either python or cython). For example in your fork you have:

import invisible_cities.reco.pmaps_functions   as pmapf
import invisible_cities.reco.pmaps_functions_c as pmapf_c
pmaps_read_parm = ('df_to_dict',
                   (pmapf.s12df_to_s12l,
                    pmapf_c.xxx))

The following may be nicer to look at:

import invisible_cities.reco.pmaps_functions   as pmapf
pmaps_read_parm = ('df_to_dict',
                   (pmapf.s12df_to_s12l,
                    pmapf.xxx))

This can still be addressed by creating aliases or using a from statement in the python module.

Once that is done we could merge. Or we could discuss whether we really want dictionaries containing dictionaries, containing tuples, containing arrays; or whether we would like to replace the arrays-in-tuples with dataframes. But I think that the latter would be better discussed in a separate issue/PR. Some observations though:

  • After my experience with writing xxx, I am less keen on the dataframes: indexing them is very slow! But perhaps this doesn't matter in our usage?
  • This issue probably needs to be discussed with reference to #157 (Modifying the format of S2Si table).

Agreed. JJ wants to use DFs because of the statistical and correlation properties they provide. After using the current tuple-of-arrays structure in a small analysis I don't see any advantage in using the DF functionalities, but my imagination is limited and future analyses may need them.

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@jacg (Collaborator) commented Mar 8, 2017

The only "advantage" I see is that gathering all functions in a single file (pmaps_functions.py) allows you to forget the different implementations (either python or cython).

Totally agreed. As far as I am concerned, all the Cython modules are low-level implementations of the Python modules, and I would like not to see them being imported directly in client code at all. (IIRC, @jjgomezcadenas disagrees). But the solution to this is not to implement pure-Python wrappers (containing arbitrary subsets of the functionality, like in the case of s12df_to_s12l (I have grown to hate that name, BTW)). No, one should simply import the Cython implementation into the pure python module. In this case, inside pmaps_functions.py we put

from invisible_cities.reco.pmaps_functions_c import xxx as s12df_to_s12l

and we're done. No need to write a wrapper around it. And indeed, you seem to acknowledge this in your comment.

JJ wants to use DFs bacause of the statistical and correlation properties they provide.

Seems jolly sensible to me. As long as the huge indexing overhead of dataframes doesn't prevent us from doing other things in a reasonable amount of time.

the current tuple-of-arrays

I really would like those to be named tuples, though. I'll implement that tomorrow morning, unless somene beats me to it. It should take all of 30 seconds. Hmm, I might just do it now ... OK, I've pushed it to my fork on the issue149 branch. (No test for accessing time and energy by name yet, though!)

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@jacg (Collaborator) commented Mar 8, 2017

Was it really necessary that @Jacq and @neuslopez would write yet another implementation?

  • Given that the original was write-only: yes.

  • Given that the brokenness of this function was getting in our way in everything we wanted to do, and re-implementing a new one was easier than understanding and fixing the original: yes.

  • Given that it was the quickest way to understand what cdf_to_dict was supposed to do: yes.

  • Given that it demonstrates in 4 clear lines what 49 convoluted lines were failing to do correctly: yes.

  • Given that it increased the bus number of the feature considerably: yes.

being called names, eg art guy

Please don't confuse statements about specific things you say, with calling you names. (We've been here before, haven't we?). It is patently clear that you are not an ARTist, but the argument you used on that occasion was very ARTistic. I make no apology for pointing this out, and I would be grateful if you return the favour, if the situation is ever reversed.

did not reinvent the wheel but turned out to be two orders of magnitude slower

While it was two orders of magnitude slower, it did have three rather important merits over the original.

  • it was correct.

  • it was obviously correct ... not to mention tested.

  • it explained what the faulty one was attempting to do.

The process demonstrated a clear performance problem with Pandas dataframes of which I, for one, was not previously aware. Oh, and it took less time to implement than these meta-discussions.

Then, a new re-reimp resurrected the wheel in cython code.

Yes, the ideas in the high-level imp were expressed at a lower level, and passed all tests at first go, beating all other existing implementations for speed.

can we solve this? By all means get rid of my faulty, clumsy and artistic imp.

That goes without saying: it contains bugs which neither of its successors do.

can we agree on some scheme to share the work that does not prompt us to three reimps of the same function in the future?

I submit to you that if you were pera-programming with us when we came up with xxx, you would have been an enthusiastic supporter of it. It took little time, and it was a very edifying experience.

I believe that we can do better than spinning around the reimp of a single function!

This meta-discussion is taking more effort than the re-imp. The delays caused by the bugs in the original took up far more resources than the re-imp. The re-imp increased the bus number of the function in question by two.

If your point is that we should talk less and do more, I agree. If your point is that we should not write code when trying to understand problematic code, then I respectfully and vehemently disagree.

Sorry about top posting, writing from iPhone on the train. Sent from my iPhone

No worries, GitHub hid the noise.

@jacg (Collaborator) commented Mar 8, 2017

i like pure pyhon hiding cython imp basically for the same reasons offered by @gonzaponte

Stripped to the bare bones, we are talking about these two alternatives:

  • Option 1

    Inside python_implementation.py you write

     import cython_implementation
    
     def feature():
         return cython_implementation.feature()
    
  • Option 2

    Inside python_implementation.py you write

      from cython_implementation import feature
    

Agreed?

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@jacg (Collaborator) commented Mar 8, 2017

Good. Now, to me, ceteris paribus, one of those options is clearly better than the other.

Agreed?

@jacg (Collaborator) commented Mar 8, 2017

Using branch issue149, see bellow the new timings using the new xxx (same Kr file):

pf.s12df_to_s12l(S1df,10000): 10 loops, best of 3: 58.2 ms per loop
cpm.xxx (S1df,10000): 10 loops, best of 3: 41.3 ms per loop

@neuslopez Could you please give us some idea of how long it takes read_pmaps to get these data from disk into memory?

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@jacg (Collaborator) commented Mar 8, 2017

Given that someone else (e.g. Gonzalo) was already looking at the problem and he was not around. Debatable, I think.

The bus number of this thing is now 4. Without xxx it would have been 2.

Post xxx I am confident that my opinion on the PR is worth hearing. Before xxx there is no way I could have formed a sensible opinion (in a any amount of time that I would have been prepared to invest in forming it).

Oh, well, I will manage to recover from current low level of Karma, don’t worry.

Your low level of Karma is but an illusion. Your Karma remains high.

Incidentally, the reason I wrote the write-only, faulty, nasty, ugly cython imp was because I had found long ago that Pandas DFs were slow. Alas, I failed to make that part of the public lore. Now it is clear.

  • The personal journey of discovery was worth far more to me than lore.

  • Just because Pandas is slow, doesn't mean that the only alternative is writing write-only, buggy code!

My point is simple: There is life (I hope) beyond XXX. Please, PR XXX and friends and let’s be done with it.

That's what I was proposing before this meta-discussion got started. Repeating the task list I made at that point ...

Tasks remaining:

  • Decide whether we prefer cdf_to_dict + s12df_to_s12l or xxx.
  • If we choose the former, combine it into a single Cython function (i.e. option 2, above).
  • Clean up and expand the tests, including any that emerge from #146 (Wrong type in last evt in s12df_to_s12l).

@jjgomezcadenas (Collaborator) commented Mar 8, 2017 via email

@gonzaponte (Collaborator, Author)

If we go for xxx we skip that step. What is your opinion, @gonzaponte?

To me, cdf_to_dict is easier to understand, but I have devoted more time to this function than to xxx. Brais says he likes xxx more and since he is not biased, I would stick to the latter. Let's move on!

@gonzaponte (Collaborator, Author)

I've just realized that the xxx function assumes that the input data frame is ordered. It is not a problem now, but it must be taken into account and should be noted in the function documentation. This probably means that the PMapWriter function/class/whatever should have a test that ensures the output file is written in the correct order.
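Such an ordering test could be as simple as the following sketch (hypothetical helper name; the real test would read the event numbers back from the written file):

```python
import numpy as np

def events_are_ordered(event_numbers):
    # True iff event numbers never decrease -- the ordering
    # assumption that xxx relies on.
    ev = np.asarray(event_numbers)
    return bool(np.all(np.diff(ev) >= 0))

assert     events_are_ordered([0, 0, 1, 3, 3])
assert not events_are_ordered([1, 0, 2])
```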

@jacg (Collaborator) commented Mar 8, 2017

OK, I'll clean up the branch and submit before I retire for the night. Failing that, first thing tomorrow morning.

The only thing I don’t like about it is that I wasn’t around eating some peras and increasing my Karma for free. I suggest that we use xxx and my previous version in our docs to show the difference between read/write and write-only code.

I think that last sentence is seriously raising your karma.

@Jacq commented Mar 8, 2017

Good to know... I have no idea what this project is about, and suddenly there are 15 notifications of this discussion in my inbox :D
I think that at some point @jacg mutated to @Jacq as the party started.
Cheers

@jacg (Collaborator) commented Mar 9, 2017

I've just realized that the xxx function assumes the ordering of the input data frame.

Well, yes, that did seem to be a fundamental feature of the data I had to go on. But I suppose you are right: it is an assumption which could be violated by the pmaps writer.

While cdf_to_dict is rather more robust in this respect than xxx, it will fail silently if an event limit is required and some portion of an event with id number greater than the limit appears before all lower-numbered events have been completed.

FWIW, here is my take on cdf_to_dict, imaginatively called yyy. It passes all the same tests and seems to be a little faster than s12df_to_s12l but slower than xxx. (Remember: we are well within a factor of two between slowest and fastest!)

import numpy as np   # assumed module-level import, needed for np.iinfo and np.array

cpdef yyy(df, max_events=None):

    cdef dict all_events = {}
    cdef dict current_event
    cdef tuple current_peak

    cdef int   [:] event = df.event.values
    cdef char  [:] peak  = df.peak .values
    cdef float [:] time  = df.time .values
    cdef float [:] ene   = df.ene  .values

    cdef int df_size = len(df.index)
    cdef int i
    cdef long limit = np.iinfo(int).max if max_events is None or max_events < 0 else max_events
    cdef int peak_number
    cdef list t, E

    for i in range(df_size):

        if event[i] >= limit: break

        current_event = all_events   .setdefault(event[i], {}      )
        current_peak  = current_event.setdefault( peak[i], ([], []))
        current_peak[0].append(time[i])
        current_peak[1].append( ene[i])

    # Postprocess: Turn lists into numpy arrays before returning
    for current_event in all_events.values():
        for peak_number, (t, E) in current_event.items():
            current_event[peak_number] = (np.array(t), np.array(E))

    return all_events

Perhaps we prefer this one to xxx?

I will prepare the PR with xxx in place, but it will be trivial to change.

One more thing: Are we attached to the name s12df_to_s12l? I have grown to hate it. The l at the end is especially irksome.

And another thing: do I have your permission to replace the tuple with a namedtuple? (I have the implementation ready and tested. It slightly degrades the performance, but I'm confident that this is completely irrelevant.)

@jjgomezcadenas (Collaborator)

My five cents:

  1. I prefer yyy to xxx. It is very concise and elegant, and the time difference is not all that relevant. Since you haven't submitted the PR yet, why don't you go ahead and use yyy instead of xxx.
  2. I suggest df_to_pmaps_dict as the new name.
  3. I am all in favour of a named tuple.

@gonzaponte (Collaborator, Author)

What a wonderful method setdefault is. I have been doing stupid things so far because I didn't know that method (and I thought I knew them all)!

Perhaps we prefer this one to xxx?

Fine with me.

  2. I suggest df_to_pmaps_dict as the new name

Since it is already in pmaps_functions.py, I would avoid naming it pmaps (simply df_to_dict?).

  3. I am all in favour of a named tuple.

Me too.

jacg added a commit to gonzaponte/IC that referenced this issue Mar 9, 2017
The write-only, buggy function `s12df_to_s12l` (the bulk of whose
implementation was in the Cython function `cdf_to_dict`) has been
replaced with a single Cython function `df_to_pmaps_dict`.

An alternative implementation replaces this one in the next commit.

Note: if the order of events in the input dataframe is not
monotonically increasing *and* an event number limit is set, silent
errors are likely to occur.

Extensive tests for the feature have been added.
@jacg (Collaborator) commented Mar 9, 2017

What a wonderful method setdefault is.

Even better, for this purpose, is collections.defaultdict, but Cython doesn't know how to type that one, so I stuck with the basic dict.
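For illustration, the defaultdict variant in plain Python would look roughly like this (a sketch with made-up data, not the IC code; yyy keeps plain dicts precisely so that Cython can type them):

```python
from collections import defaultdict
import numpy as np

# {event: {peak: ([times], [energies])}} with no explicit setdefault calls:
all_events = defaultdict(lambda: defaultdict(lambda: ([], [])))

rows = [(3, 0, 1.0, 10.0), (3, 0, 2.0, 20.0), (5, 1, 3.0, 30.0)]
for ev, pk, t, e in rows:
    all_events[ev][pk][0].append(t)
    all_events[ev][pk][1].append(e)

# Freeze into plain dicts of numpy arrays, as yyy's postprocessing does:
result = {ev: {pk: (np.array(t), np.array(e))
               for pk, (t, e) in pks.items()}
          for ev, pks in all_events.items()}
```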

jacg closed this as completed in 265a297 Mar 9, 2017