Add support for yaml file format #88

Open
gabriel-vasile opened this issue Mar 3, 2020 · 7 comments

@gabriel-vasile
Owner

  1. Specify the MIME type and extension for which to add support
    application/x-yaml or text/yaml
  2. Share an example file
    https://github.com/kubernetes-sigs/kustomize/blob/master/examples/helloWorld/configMap.yaml
  3. Optionally, add a reference to the specification of the file format.
    https://yaml.org
    My approach to this would be similar to the JSON detection. A scanner can validate YAML text and return the index where a possible error occurred.
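
The scanner idea above can be illustrated with JSON, for which Go's standard library already ships a scanner. This is only a sketch of the principle, not the library's actual JSON detection: `plausibleJSON` is a hypothetical helper, and a YAML version would need its own scanner. The point it shows is that an error at the very end of a truncated header should not disqualify the input, while an error inside the header should.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// plausibleJSON is a hypothetical helper: it reports whether header could be
// the beginning of a JSON document. errOffset is -1 when no error was found,
// len(header) when the input merely ended too early, and otherwise the offset
// reported by the scanner for the syntax error.
func plausibleJSON(header []byte) (errOffset int64, ok bool) {
	dec := json.NewDecoder(bytes.NewReader(header))
	var v interface{}
	err := dec.Decode(&v)
	if err == nil {
		return -1, true
	}
	if errors.Is(err, io.ErrUnexpectedEOF) {
		// The header was cut off mid-value; everything seen so far was valid.
		return int64(len(header)), true
	}
	var syn *json.SyntaxError
	if errors.As(err, &syn) {
		// A real syntax error inside the header.
		return syn.Offset, false
	}
	// Anything else (for example, empty input) is treated as not JSON.
	return 0, false
}

func main() {
	samples := []string{
		`{"kind": "ConfigMap", "data": {"a`, // cut off mid-string
		"kind: ConfigMap",                   // YAML, not JSON
	}
	for _, sample := range samples {
		offset, ok := plausibleJSON([]byte(sample))
		fmt.Printf("%q plausible=%v offset=%d\n", sample, ok, offset)
	}
}
```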
@gabriel-vasile gabriel-vasile added the enhancement New feature or request label Mar 3, 2020
@gabriel-vasile gabriel-vasile added this to the v1.0.4 milestone Mar 7, 2020
@tebrizetayi
Contributor

Is someone working on this issue?

@gabriel-vasile
Owner Author

As far as I know, no one is working on YAML. However, you should know that writing a YAML scanner is quite a time-consuming task.
If you want to go for other file formats, off the top of my head I can name:
cpio, lzip, Java archives, CorelDRAW files, zoo archives, BitTorrent files.

Good resources for how to identify these formats:
https://www.garykessler.net/library/file_sigs.html
https://github.com/file/file

@tebrizetayi
Contributor

tebrizetayi commented Mar 22, 2020

However, you should know that writing a YAML scanner is quite a time-consuming task.

Why don't we use a standard Go library for YAML?

If you want to go for other file formats, off the top of my head I can name:
cpio, lzip, Java archives, CorelDRAW files, zoo archives, BitTorrent files.

JAR archives are already done.
I can take one of the others too.

@tebrizetayi
Contributor

I want to write a matcher for CorelDRAW files. To check the format, though, I also need to pass the file size to the matcher function, which accepts only one argument. How can we pass the file size along with the byte array to the matcher function?

@gabriel-vasile
Owner Author

Unfortunately, the size of the file is not available. The main reason is that the library limits itself to reading just the header of files in order to save memory.
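
To make the constraint concrete, here is a minimal sketch of header-only detection. All names (`headerLimit`, `matcher`, `readHeader`, `detectReader`, `detectBytes`) and the 16-byte limit are assumptions made for illustration, not the library's real identifiers or defaults. The matcher receives only the leading bytes, so it has no way of learning the total size of the input.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// headerLimit is a hypothetical cap on how many bytes are read per input.
const headerLimit = 16

// matcher is a hypothetical signature: it only ever sees the header bytes.
type matcher func(header []byte) bool

// readHeader reads at most headerLimit bytes from r and never touches the rest.
func readHeader(r io.Reader) ([]byte, error) {
	buf := make([]byte, headerLimit)
	n, err := io.ReadFull(r, buf)
	// Inputs shorter than the limit are fine; keep whatever was read.
	if err == io.EOF || err == io.ErrUnexpectedEOF {
		err = nil
	}
	return buf[:n], err
}

// detectReader applies m to the header of r.
func detectReader(r io.Reader, m matcher) (bool, error) {
	header, err := readHeader(r)
	if err != nil {
		return false, err
	}
	return m(header), nil
}

// detectBytes is the in-memory variant of detectReader.
func detectBytes(data []byte, m matcher) bool {
	ok, _ := detectReader(bytes.NewReader(data), m) // bytes.Reader does not fail
	return ok
}

func main() {
	input := make([]byte, 1000) // stands in for a large file
	detectBytes(input, func(header []byte) bool {
		fmt.Printf("input has %d bytes, matcher sees %d\n", len(input), len(header))
		return false
	})
}
```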

I'm not familiar with the Corel file format, but after reading the Wikipedia info and looking at how others detect it, I think it can be detected without knowing the length.

As far as I can see, the official Corel file format specification is not publicly available.
Can you please link your source so we can compare it with what I've found and sort this out?

@tebrizetayi
Contributor

https://www.ntfs.com/corel-draw-format.htm

The second byte fragment is for checking the file size. I checked it with my own .cdr file and it works.

@gabriel-vasile
Owner Author

I guess you can skip the file size check and just check for the magic numbers. Don't forget to add all the aliases this MIME type has. (tika) ☺️
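
A magic-number-only check along these lines might look like the sketch below. It assumes the RIFF-based layout discussed in this thread: the ASCII bytes `RIFF` at offsets 0-3, four size bytes at offsets 4-7 that are deliberately ignored, and the ASCII bytes `CDR` at offsets 8-10. The function name `isCorelDraw` is hypothetical, and the check covers only that layout.

```go
package main

import (
	"bytes"
	"fmt"
)

// isCorelDraw is a hypothetical matcher for RIFF-based CorelDRAW files.
// Assumed layout: "RIFF" at bytes 0-3, a size field at bytes 4-7 (not
// checked, since the real file size is unknown), and "CDR" at bytes 8-10.
func isCorelDraw(header []byte) bool {
	return len(header) >= 11 &&
		bytes.HasPrefix(header, []byte("RIFF")) &&
		bytes.Equal(header[8:11], []byte("CDR"))
}

func main() {
	cdr := []byte("RIFF\x10\x00\x00\x00CDRvvrsn")
	wav := []byte("RIFF\x10\x00\x00\x00WAVEfmt ")
	fmt.Println("cdr sample:", isCorelDraw(cdr))
	fmt.Println("wav sample:", isCorelDraw(wav))
}
```

Because the size field is skipped, the same check works on a truncated header and on a complete file alike.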
