Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define what a metric can be. #417

Closed
docwhat opened this issue Jun 26, 2015 · 6 comments
Closed

Define what a metric can be. #417

docwhat opened this issue Jun 26, 2015 · 6 comments

Comments

@docwhat
Copy link

docwhat commented Jun 26, 2015

Right now, graphite and carbon both share the concept of a metric, but they don't have any kind of validation.

I haven't tested it, but given the way the getFilesystemPath() works, I would assume you could craft a metric that would cause problems on the carbon server. Especially on windows.

I think it'd be beneficial if a metric was fully defined. Something like:

  1. A metric path contains metric nodes separated by ..
  2. A metric path may not exceed 1024 characters. (Or some limit. Windows NTFS has a path limit, depending on the filesystem type)
  3. A metric node may only contain characters that match the regexp: /[A-Za-z0-9_-]+/

This should prevent most/all exploits involving getFilesystemPath() in carbon.

It would also fix potential problems in graphite-web() mechanisms as well.

@docwhat
Copy link
Author

docwhat commented Jun 26, 2015

In addition, there probably should be an explanation of how a metric path is normalized:

  1. All duplicate .s are removed; e.g. s/\.\.+/\./g
  2. All leading and trailing . are removed.
  3. For each metric node, all non-permitted characters are removed.
  4. Any characters over the limit are removed.

@seh
Copy link

seh commented Jun 26, 2015

Potentially related: #181 and #331.

@Dieterbe
Copy link

All leading and trailing . are removed.

for leading . it seems to work fine, for trailing, i noticed carbon/whisper creates an extra directory with a file named .wsp inside of it.

@andyxning
Copy link

additionally, Any filesystem has a constraint on the length of the filename length.

@andyxning
Copy link

The final code for generating the file path for a metric is in database.py:

def getFilesystemPath(self, metric):
      metric_path = metric.replace('.', sep).lstrip(sep) + '.wsp'
      return join(self.data_dir, metric_path)

We can test it like this:

>>> '.a.b'.replace('.', os.path.sep).lstrip(os.path.sep) + '.wsp'
'a/b.wsp'
>>> '.a.b.'.replace('.', os.path.sep).lstrip(os.path.sep) + '.wsp'
'a/b/.wsp'

As shown in the above, metric like '.a.b' will be alright, however, metric like '.a.b.' will result in a path name like a/b/.wsp which will actually result in a hidden file .wsp. This may not be good.

Maybe we can change it with:

def getFilesystemPath(self, metric):
      # change 'a/b/.wsp' to 'a/b.wsp' 
      metric_path = metric.replace('.', sep).lstrip(sep).rstrip(sep) + '.wsp'
      return join(self.data_dir, metric_path)

@stale
Copy link

stale bot commented Apr 13, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants