Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle encoding errors #1200

Closed
mrbean-bremen opened this issue Aug 9, 2021 · 13 comments
Closed

Handle encoding errors #1200

mrbean-bremen opened this issue Aug 9, 2021 · 13 comments

Comments

@mrbean-bremen
Copy link
Collaborator

mrbean-bremen commented Aug 9, 2021

This is a spin-off of PR #1198 and may be done after that is finished.

There are a few cases of incorrect Specific Character Set contents that are currently not handled:

  • cases of spacs/underscore mixup are already handled, extend this to dashes that sometimes are used instead of underscores or spaces
  • invalid encodings are currently silently ignored and fallback to ASCII - I think it would make sense to issue a warning in the log file in this case; the same is true for invalid escape sequences
  • normal encodings are used instead of extended encodings: not sure if we need to warn here, as the intention is probably clear
  • stand-alone encodings like UTF-8 are used as extended encoding - these should be ignored, and a warning makes sense
  • additional encodings with stand-alone encodings (e.g. more than one encoding if the first encoding is a stand-alone encoding): the same, additional encoding shall be ignored, and a warning issued

Somewhat related are decoding errors, e.g. text values that cannot be decoded in the given encoding. Currently, these are silently ignored, and replacement characters are used. We may both issue a warning and use replacement characters here - to be discussed (I'm not sure about the policy with respect to incorrect DICOM in fo-dicom).

@gofal
Copy link
Contributor

gofal commented Aug 10, 2021

the policy with incorrect DICOM is: when generating DICOM data then be as strict as possible, when reading/consuming DICOM data then be as tollerant as possible.
So if the developer create a new Dataset, sets the encoding to ASCII and then adds a text that contains characters that are not encodable in ASCII, then go ahead and throw an exception.
But if there is a storescp running and a dicomdataset is arriving via network stream with encoding ASCII but some binary data in a string-based-Tag, that cannot be decoded, then we should still process the file and add replacement characters. A warning or error in log is a very good idea here!

error correction in reading DICOM data is a difficult topic. if there are some invalid data but the error is obvious and unambiguous to correct (using spaces or underscores mixed up in values of SpecificCharacterSet), then it is fine, if fo-dicom does this implicit. If there are some other invalid encodings or others, then it is hard to guess the right encoding and fix it. Thats why the default-encoding is used then (ASCII) and replacement characters are used for the unknown characters. In the end this makes it visible to the user that something went wrong with the data. If fo-dicom would somehow guess the encoding and sadly guesses the wrong encoding, the we would generate some wrong data. But logging a warn in this fallback-scenario is a very good idea, of course.

@mrbean-bremen
Copy link
Collaborator Author

when generating DICOM data then be as strict as possible, when reading/consuming DICOM data then be as tollerant as possible.

Yes, that's always a good policy. I think the usage of replacement characters is also what is expected, so what can be added are the respective warnings in the log for invalid encodings, or invalid text for a given encoding.

@mrbean-bremen
Copy link
Collaborator Author

I checked how logging is done, and I'm not sure yet what is the correct way to log in a library function. Is there a possibility to use the registered logger(s) for logging, or do we need to pass the logger from the caller (this seems to be done in a few cases)?
Maybe register a null logger (that does nothing) by default, and make it possible for executables to register their own logger instead, that would be used in this case... what do you think?

@gofal
Copy link
Contributor

gofal commented Aug 10, 2021

There is the interface FellowOakDicom.Log.ILogManager with some default implementation NullLoggerManager and some other implementations like ConsoleLogManager, TextWriterLogManager or NLogManager.
The logger is registered in ServiceProvider. The instances of DicomClient or DicomServer receive the ILogManager instance from ServiceProvider. Then they pass their instance down into the methods. This is because of scope. The developer in the executable may have an implementation where the DicomClient or the DicomAssociation get a new instance of the logger and hold some contextual information (like associationnumber, AETitles etc). If the underlying methods and classes then would create their own instances of logger from ServiceProvider, then you would loose the context in which association this log entry has happened. Thats why it's mixed. Higher level classes that may represend some kind of context get a new instance from Service provider, and lower level classes then get those loggers passed via parameter.

@mrbean-bremen
Copy link
Collaborator Author

mrbean-bremen commented Aug 11, 2021

Thank you, that makes sense!
So, as far as I understand, it is not a good idea to get a logger instance say, in the static DicomEncoding constructor, via Setup.ServiceProvider.GetRequiredService<ILogManager>.... Instead, the logger shall be passed down from the caller in the called methods, and as the caller does not have a logger either, from further up. Is that correct? I kind of understand in this case, why no logging is done so far at that level...

@gofal
Copy link
Contributor

gofal commented Aug 12, 2021

I kind of understand in this case, why no logging is done so far at that level...

yes, it was too much headache for most developers to think about where to get the logger instance from :-)

Sure, there are situations where you do not have such a context, for example if you just create a new DicomDataset somewhere in code outside of a DicomClient or DicomServer.
After thinking a while about it: what would you say to the idea of adding optional parameter of ILogger to the methods AND creating an own instance in DicomEncoding (you should do lazy creation, because we have to make sure that the configuration of ILoggerManager is done before the static constructor creates the ILogger instance). Then the logging is done with something like (loggerPassedAsParameter ?? staticCreatedLogger).Log(.....). It then would require to pass the logger of Dicomclient or DicomServer to those methods as good as possible. In all the other cases the logging is done without context.

@mrbean-bremen
Copy link
Collaborator Author

mrbean-bremen commented Aug 12, 2021

what would you say to the idea of adding optional parameter of ILogger to the methods AND creating an own instance in DicomEncoding

That sounds like a plan, though the logger would probably have to be added to several methods, which is kind of ugly... I understand that there can be more than one registered logger, therefore it would not be sufficient to just use the registered logger, and there is no concept of a current logger to be used.
Maybe there aren't so many methods to be adapted in the end, so it will not be a big deal.

@mrbean-bremen
Copy link
Collaborator Author

Hm, I'm not sure about the logging yet. The problem is: if I want to pass the logger into the encoding, I have to add it to DicomElement, because it is needed in DicomElement.StringValue. That would imply adding it to all DicomElement derivate constructors, including DicomMultiStringElement, which has a params string[] argument, so it cannot go last and cannot get a default value. Adding a logger to each element does not seem right anyway, so maybe some kind of a current logger would make sense here. I will have another look later...

@gofal
Copy link
Contributor

gofal commented Aug 15, 2021

I see. It's not easy.
What about loosely coupled. I mean instead of passing the logger as parameter into the DicomElement, the DicomElement could have an event with a handled-property.
The dicomElement could have a method that is called "DoLog" and that raises the log-event and if the handled property of the EventArgs is then still false then logging into the current logger taken from DI-container.
The DicomDatasets that are using these tags from within any context (like DicomClient or DicomServer) could attach this event, log into their scoped logger instance and set the handeld property to true.
Would that also generate too much overhead?
If yes, then I guess the DicomEncoding should just log into a default logger. It's then up to the developer to find the source of the logging by debugging into the application....

@mrbean-bremen
Copy link
Collaborator Author

Thank you. The event mechanism is certainly a better solution as it avoids passing the logger around too much. I'm not sure yet how the logging would be done in DicomEncoding in this case, as this is a static class without context, but I will check this.

Logging into a default logger does not sound like a good idea, if there already is a logger configured. Before doing anything, I probably first have to understand the current logging system better (I misunderstood this first, as I thought that all loggers are registered in a central instance). I didn't get to it today as I planned, but I will wait anyway for the other PR to be finished before starting a new one to avoid conflicts.

@mrbean-bremen
Copy link
Collaborator Author

The DicomDatasets that are using these tags from within any context (like DicomClient or DicomServer) could attach this event, log into their scoped logger instance and set the handeld property to true.

I may not understand this correctly. If a DicomClient or DicomServer subscribes to the event (that is what you mean by "attach the event", correct?), they have to subscribe to the event source, which would be each DicomElement of the data set - obviously not what you mean. Or do you mean to have a static event? Could you elaborate a bit on this?

@gofal
Copy link
Contributor

gofal commented Aug 20, 2021

You are absolutely right, my idea went into the wrong direction. Actually I thought of that the DicomClient or DicomServer subscribed the log-event of each Dicomelement. But that would cause a huge amout of issues.
The DicomElements are designed to be kind of immutable. When you update a value of a DicomTag, then the old DicomTag is removed from DicomDataset and the new one is added in the list. This allows that duplicating datasets or passing datasets around is fast, because only the references to the Tags are copied. If the DicomTags would have any kind of reference to the DicomDatasets - and if it would only be via events - then this architecture would break.

so now, after digging into this part of fo-dicom again (it was quite some time ago when I last did this), I see no chance to get some kind of context from DicomClient or Dicomserver into the DicomEncoding class. As long as you do not find any way to manage this, I only can see the possiblility, that the DicomEncoding class holds its own logger instance (getting it from DI-container or creating one directly or however) and log the warnings there. Maybe you could add the current ThreadID into the log entry, then the developer has a chance to detect by ThreadId or Timestamp the context, where the log entry belongs to....

@mrbean-bremen
Copy link
Collaborator Author

Thanks - this is unfortunate. I thought I had missed something. I will probably have another look to understand the logging concept, maybe I can think of something.

@gofal gofal mentioned this issue Aug 20, 2021
mrbean-bremen added a commit to mrbean-bremen/fo-dicom that referenced this issue Aug 22, 2021
- use log collector for unit tests
- see fo-dicom#1200
mrbean-bremen added a commit to mrbean-bremen/fo-dicom that referenced this issue Aug 23, 2021
- use log collector for unit tests
- see fo-dicom#1200
gofal added a commit that referenced this issue Sep 8, 2021
* Add error handling for decoding errors

- use log collector for unit tests
- see #1200

* Findings from review

- fix delimiter handling in fragments
- do not use a static logger for DicomEncoding
- use log collection only for encoding tests

* implement fixture in that way that it also runs green if the collections are executed in parallel

* fix: registered the serviceprovider on the wrong side of the compiler directive

Co-authored-by: Reinhard Gruber <gofal@users.noreply.github.com>
Co-authored-by: Reinhard Gruber <r.gruber@smartinmedia.com>
@gofal gofal closed this as completed Sep 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants