-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reconfigure the detectors to use global code for the Binary Serializers #112
Comments
We should agree on a design for this up front before implementing - IMO a good reason to have discussions that get promoted to issues. |
I agree but I already started a POC and needed to commit the code somewhere so I created a branch where I can try things out. This will probably take some time and I really need to try things out in code before coming up with a design. If you prefer I work on a fork, I can do that. |
Fork could work, or, middle of the road. Could be to rename this ticket to something like "spike - experiment with decoupling detector serialization". In my opinion, there's more refactoring to consider beyond serialization for detectors, that could make it easier to implement with a lower touch/surface area. This larger piece could interact with our serialization strategy. I may have a little bit of time opening up to complete the initial python code and work on this with you on a spike branch where we intend to throw away the code and implement the final design we come up with. |
Hi friends. Did a little thinking and have two (hopefully constructive) pieces of feedback on this ticket:
The alternative would be to put the Caveat - as I've leaned into more "functional programming" modalities, I've tended to see the benefit of the segregation of constructed classes from the "helper" methods that allow their creation/manipulation. Think LINQ - all those Select, Where operators etc are defined in separate classes from the collection classes that "hold" the data. Another note: I see that the Lmk what y'all think - I can make any of the above changes if we want.
https://gist.github.com/dcuccia/552c52716211cfecaf48dd84fe254f8b It doesn't change the surface area of the public API nothing in the detector code would change. Could make the factory method consolidation to BinaryArraySerializer a little more palatable, if we were going that direction. I haven't done much testing, but this should be a "good performance" version without any boxing or copying. Let me know what you think. |
@dcuccia thank you for thinking this through! Let me think about your suggestions and I will also take a look at your Gist when I have more time.
Just a quick note on this, we didn't forget it, @hayakawa16 has made a note to add it back in once all the work on the detectors is complete because it is not obsolete until then and it will avoid many warnings on the unchanged detectors while the code is in progress. Good eye noticing that though :) |
Cool, thanks on both accounts! |
I added the Obsolete attribute to DataArray in BinaryArraySerializer and all unit tests run fine. I think that means I have updated all detectors? |
Thanks Carole. I think there are still 53 references to |
Hi, I did a Find on BinaryArraySerializer.DataArray and selected "Entire Solution" and the only one that is shown is in Vts.xml. How are you searching? |
Are you looking at the feature/112-reconfigure-the-detectors-to-use-global-code-for-the-binary-serializers_2 (note "_2" at end) branch? This is the new branch Lisa and I made that picked out only those changes that support your implementation (other trial code was omitted). |
I was looking at _2. Is this not the right branch? |
Yes! How did you find those 53 references? |
VS said there were 53 usages. On my way home and will try to repro there. |
Thanks! Also have you pulled the latest? I've been busy updating detectors this week. |
Oh, that's weird. Might be a Visual Studio cache issue...sorry stand by. |
Wait, now I'm seeing them. Okay, let me work on those detectors. I kind of feel like I modified these detectors already. Well oh well. Deja vu all over again! |
Thanks for the helpful screen shot. I modified all of those files. The only place string "DataArray" shows up now is in Vts.xml and [Obsolete] DataArray in BinaryArraySerializer and BinaryArraySerializerFactory. |
This conversation reminded me that when I was working with the serialization code, it doesn't appear that Dimensions in BinaryArraySerializer is used either. @dcuccia & @hayakawa16 Could you you take a look and see if that is correct or if it is being used elsewhere? |
I found no references to BinaryArraySerializer.Dimensions. Shall I add Obsolete attribute? |
Good call - that is no longer needed, and was new as of this effort. Please delete it. |
@dcuccia it was not new, it's existing, I noticed it wasn't being used though when I first started the work on the detectors. We should probably put the obsolete attribute on it just in case it was being used externally and then remove it when we remove DataArray. |
My mistake. I think I'd added a GetDimensions method and removed it, which is what had confused me. Sounds good. |
Hi Carole, I will have time later to check out the code, but I find this type of code is useful to have as a test-only library helper (it's not relevant for the production code, just the tests. It could be in a Helpers folder in the test library, or if multiple test libraries need access to it, create a Vts.Test.Common or similar library. In terms of location, this would be testing the serialization of that specific class, so I assume we want a 1:1 relationship between the .cs test class and the detector (1 for each). This item unit tests the serialization, another could unit test the accuracy, etc. |
@hayakawa16 when I create helper methods for unit testing, I usually create a separate folder called helpers that doesn't have a matching folder in the project being tested. You could put that in the MonteCarlo folder or at the folder level of the Detectors depending on where it will be used. I also wanted to mention something here about the tests in the VTS. When we were adding test coverage we noticed that the majority of the tests were in fact integration tests and not true unit tests. Where possible, I added unit tests to increase the code coverage but in some case it wasn't possible due to way the original code was written, static classes and methods etc. It would be good once we have the coverage where we want it to be, to identify the areas where we are missing unit tests and add them. This might be a good time to come up with a structure for the test application so we can easily differentiate unit tests and integration tests. In my projects I name the unit test classes the same as the class being unit tested and then I put the integration tests in a separate location under an IntegrationTests folder because they usually include many classes. Any suggestions for this would be welcome. |
Great points Lisa. At Modulim, we go as far as to have separate libraries, e.g. Vts.Test.Unit and Vts.Test.Integration. We run the Unit tests in CI for every branch commit, and run the Integration tests once a (non-draft) PR is created. |
@dcuccia I like that idea! I was going to say separate projects but I didn't know if that was overkill. The fact that you used it at Modulim is good because it sounds like it worked well. |
It does. The other way to do it is to flag tests/test classes as integration or unit or e2e, and use a command line switch to run one or the other. We have some very slow running tests that depend on an outside service that we only run before we build a production release candidate, and we use flags for those. |
I like these ideas. I have more nuts and bolts questions:
rather than
Otherwise the SecondMoment array was not in the serializers list in WriteClearAndReReadArrays in DetectorBinarySerializerHelper. I have pushed code in case you'd like to give it a look. |
Defer to Lisa on 2 & 3. IMO for 1, they should be returning an empty array, not null, and I'd make sure to "move" those tests to the same format as all others (testing serializers is just one part of unit testing the detector class - you'll also want to test correctness, etc). Re: 4, I don't understand - why does SecondMoment need to be allocated, if it's never used? That said, a "fully-initialized" class is always preferable from a code reliability perspective, I just thought we'd kept those null so that we could use all the available memory for the main detector, if that was a limitation. FWIW, if we intend to keep them possibly null, we should mark with the nullable operator (i.e. |
Regarding 4, you're absolutely correct. When I was debugging code and stepping through, SecondMoment wasn't getting set somehow. I reverted code and all is good now. |
Regarding 1, the single valued detectors write their Mean and SecondMoment to the detector.txt ascii file, there is no binary serialization going on. Do you still feel same way? |
Right - that's always been the case. The test, in that case, should just confirm the expectation - that |
I think that's a good place
Yes add them now and we can figure it out later, try to keep integration and unit tests separate even if it is just with a comment in the same file |
I agree with @dcuccia for 1. returning empty array vs null but if this is too much to change now (this is prevalent in our code and needs resolving globally) we can leave it as is and fix later. |
Thanks for your feedback! I will continue along adding unit tests then. |
I am finding I am coding the same code in multiple tests, so I was thinking of optimizing somehow. At first I tried:
but IDetector does not include Mean or SecondMoment so not possible. |
I think this is the unit test for the detector, right? So, my suggestion is a 1:1 relationship between the .cs test class and the detector (1 "XYZDetectorTests.cs" for each class). And in that case, each detector "knows" about it's own things, so you don't need to represent it with an interface, but instead the implementation type. Yes, it's repetitive, but it segregates different detector tests, and will make those files that are "fine" essentially read-only, while new detectors/tests can be written without disturbing the existing ones (more scalable for new work, new devs, etc). We can make more "helper" static libraries, like |
@hayakawa16 Consolidating test code is a great idea, I think I mentioned this last week in our meeting that if we can create generic code for testing using the data we have, we should go that route rather than duplicating the same code in individual tests. If you would like to work together on this, let me know. We can probably create another helper method that can be called from each test to avoid the duplication. |
(and in the case of the 1D, 2D, etc...those should be tests of DetectorIO libraries, with only one example for each "shape" |
To respond to @dcuccia, this is the combined test for ROfAngleDetector and ROfRhoDetector. I think you are suggesting not to do this and that the current DetectorIOTests file be separated into one for each detector? If so would that detector test also test methods like Normalize or just GetBinarySerializers? |
@hayakawa16 all tests relevant to that detector's performance and behavior (and nothing else). |
Got it. Let me discuss with @lmalenfant how we organize these types of single layer tests versus the integration tests. In the meantime I will continue filling out the tests in DetectorIOTests so that I can verify new code, and I can separate later. |
Sounds good! |
I'm half way there with the unit tests! I see that there have been multiple topics brought up in this issue. I have created a list of the order I plan to proceed (this is up for discussion):
|
This looks good @hayakawa16 although I think 2. could be further down the list because this is part of a larger cleanup effort. Plus the detectors that contain this code do not have the serialization code and are not being modified with this change. Since this is already a huge change we need to keep these changes to a minimum because they will be easier to review in the PR. I would choose the option to merge after 3. |
@hayakawa16 would you like me to write the unit tests for BinaryArraySerializerFactory? @dcuccia I was able to look at your Gist finally and I like it, if we write the unit tests for the code as is, we can switch it for the new code and validate that it still works the same way. |
Sounds good! :) |
@lmalenfant that would be super great! |
I have finished updated all 72 detectors with the new GetBinarySerializers code and written unit tests for them all (included testing what results if TallySecondMoment is set to true and false using neat nunit [TestCase(true)] and [TestCase(false)] attributes that Lisa used in her unit tests for BinaryArraySerializerFactoryTests). These tests are currently in a single file. I plan to break each test into a separate file named by the detector it is testing. For example, ROfRhoDetectorTests.cs. The GetBinarySerializers method will be the only test for ROfRhoDetector in this file, however in the future we can add other unit tests to this file for the other methods in ROfRhoDetector class. I plan to create a folder under Vts.Test named "Unit". Then in that folder, a subfolder named "MonteCarlo", then a subfolder in that named "Detectors". Then when I separate the unit tests into the separate files, I will put them in this "Detectors" folder and give the tests the namespace Vts.Test.Unit.MonteCarlo.Detectors. Let me know if you have any comments or questions about this plan. |
@hayakawa16 That sounds good to me. |
The Detectors section of MonteCarlo has a lot of duplicated code and every time we add another detector, the code is duplicated more. The Binary Serializers could be consolidated for the detectors so this would reduce the code duplication.
The text was updated successfully, but these errors were encountered: