Decoders for sources #257

Closed
lukesteensen opened this issue Apr 8, 2019 · 2 comments
Labels
domain: codecs - Anything related to Vector's codecs (encoding/decoding)
domain: data model - Anything related to Vector's internal data model
domain: logs - Anything related to Vector's log events
domain: sources - Anything related to Vector's sources
have: should - We should have this feature, but it is not required. It is medium priority.
needs: requirements - Needs a list of requirements before work can begin
type: enhancement - A value-adding code change that enhances existing functionality.

Comments

@lukesteensen
Member

For applicable sources, you should be able to configure a decoder (e.g. JSON) that will fully parse and structure the incoming data. The difference between a source-configured decoder and a downstream parsing transform is that a source-configured decoder will be expected to completely parse the entire message into structured data and drop the original raw bytes representation.
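To make that distinction concrete, here is a minimal Python sketch. It is not Vector's implementation: the function names and the `message` field used to hold the raw text are assumptions made purely for illustration.

```python
import json


def source_with_decoder(raw: bytes) -> dict:
    """Source-configured decoder (hypothetical): parse the whole payload
    into structured data and keep no copy of the original bytes."""
    event = json.loads(raw.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


def source_then_transform(raw: bytes) -> dict:
    """Plain source followed by a downstream parsing transform (hypothetical):
    the raw text stays on the event, here under an assumed `message` key."""
    event = {"message": raw.decode("utf-8")}
    parsed = json.loads(event["message"])
    if isinstance(parsed, dict):
        event.update(parsed)  # parsed fields are added next to the raw text
    return event
```

With the payload `{"level": "info", "code": 200}`, the first function yields only `level` and `code`, while the second also carries the original text in `message`.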

This will also require figuring out how downstream transforms and sinks will handle this new type of record with no raw data. Similar to #256, we may need to add a conditional or match to detect this case and provide a different default behavior that makes sense for this kind of data.
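One possible shape for such a conditional, sketched in Python under the assumption that raw data, when present, lives under a `message` key. Both `encode_for_sink` and `raw_key` are hypothetical names, not anything in Vector.

```python
import json


def encode_for_sink(event: dict, raw_key: str = "message") -> str:
    """Pick a default output encoding depending on whether raw data exists."""
    raw = event.get(raw_key)
    if isinstance(raw, str):
        # Raw text is available, so pass it through unchanged.
        return raw
    # Fully structured record with no raw data: fall back to serializing
    # every field (JSON is an arbitrary choice for this sketch).
    return json.dumps(event, sort_keys=True)
```

The point of the sketch is only that the no-raw-data case is detected explicitly rather than silently producing an empty line.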

This would also benefit greatly from #235, so that we could detect and prevent cases where downstream components would try to read the raw original data by default and end up with surprising behavior.

Related to both of the above, there may be decoders (syslog comes to mind) where it makes sense to set one parsed field (e.g. message) as the default target of downstream parsing components.
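As a rough illustration of a decoder that nominates a default target, the Python sketch below parses a simplified BSD-style syslog line and reports `message` as the field downstream parsers should read. The `Decoded` type, `default_target`, and the regular expression are assumptions for this example and cover only a narrow subset of real syslog input.

```python
import re
from typing import NamedTuple, Optional


class Decoded(NamedTuple):
    fields: dict
    default_target: Optional[str]  # field downstream parsers read by default


# Simplified pattern: <PRI>Mmm dd hh:mm:ss host rest-of-line
_SYSLOG = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>\w{3} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<message>.*)$"
)


def decode_syslog(line: str) -> Decoded:
    match = _SYSLOG.match(line)
    if match is None:
        raise ValueError("not a recognised syslog line")
    fields = match.groupdict()
    # PRI encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
    return Decoded(fields, default_target="message")


def default_input(decoded: Decoded) -> Optional[str]:
    """What a downstream parsing component would read when not told otherwise."""
    if decoded.default_target is None:
        return None
    return decoded.fields[decoded.default_target]
```

A decoder with no sensible default (plain JSON, say) would return `default_target=None`, which gives downstream components something explicit to check.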

@lukesteensen added the type: enhancement and Core: Data Processing labels Apr 8, 2019
@lukesteensen added this to the 0.1.1 milestone Apr 8, 2019
@binarylogic removed this from the 0.1.1 milestone Jun 7, 2019
@binarylogic added this to the Initial schema support milestone Feb 25, 2020
@Hoverbear
Contributor

Perhaps this could share code with the downstream transforms by letting users name the transform in a codec (or decoder?) option.

E.g.

[source.console.foo]
# ...
codec = "json_parser"
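
A small Python sketch of how that sharing might look: one registry of parsers, looked up by name from both the source and the transform. Everything here (`PARSERS`, `run_source`, `run_transform`, the `message` field) is hypothetical and only mirrors the `codec = "json_parser"` idea above.

```python
import json
from typing import Callable, Dict, List, Optional

# Single registry, so sources and transforms reuse the same parsing code.
PARSERS: Dict[str, Callable[[str], dict]] = {
    "json_parser": json.loads,  # assumes each input is a JSON object
}


def run_transform(name: str, event: dict, field: str = "message") -> dict:
    """Downstream transform: parse one field and merge the result in."""
    out = dict(event)
    out.update(PARSERS[name](event[field]))
    return out


def run_source(lines: List[str], codec: Optional[str] = None) -> List[dict]:
    """Source: without a codec, emit raw lines; with one, emit parsed events."""
    if codec is None:
        return [{"message": line} for line in lines]
    return [PARSERS[codec](line) for line in lines]
```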

@binarylogic added the needs: requirements label Mar 9, 2020
@binarylogic removed this from the Initial Schema Support milestone Aug 2, 2020
@binarylogic added the domain: logs, domain: data model, have: should, domain: codecs, and domain: sources labels and removed the event type: log label Aug 6, 2020
@tobz
Contributor

tobz commented Nov 5, 2021

This should be addressed already by the framing/decoding work that @pablosichert has done for sources, and anything leftover would be addressed by either #9388 or #9930.

Gonna close for now.

@tobz closed this as completed Nov 5, 2021