detect-secrets audit file issue #272

Closed
Staggerlee011 opened this issue Dec 10, 2019 · 9 comments

@Staggerlee011

Staggerlee011 commented Dec 10, 2019

Hi,

When I run an audit of a baseline, I get the error:

Not a valid baseline file!

Running on Windows 10, with detect-secrets installed via pip.

detect-secrets  --version
0.13.0

Setting up baseline:

detect-secrets scan > .secrets.baseline

Running the audit:

detect-secrets audit .secrets.baseline

I'm not sure how to solve it. Any suggestions?

@domanchi
Contributor

@Staggerlee011, what does your baseline file look like?

@Staggerlee011
Author

Hi, it wouldn't let me upload the file here, so I posted it at this link.

@Staggerlee011
Author

Hey @domanchi, can you make sense of it?

@gadeweever

gadeweever commented Apr 24, 2020

Hey, same issue here, but this is on 0.13.1.

C:\...\spectacular-potatoes [feature/testsecret +2 ~1 -0 !]> detect-secrets scan > .secrets.baseline
C:\..\spectacular-potatoes [feature/testsecret +3 ~1 -0 !]> git add LanguageContact/API/Web.config
C:\..\spectacular-potatoes [feature/testsecret +0 ~1 -0 | +3 ~0 -0 !]> git commit -m "test secret"
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...........................................(no files to check)Skipped
Check for added large files..............................................Passed
Detect secrets...........................................................Failed
- hook id: detect-secrets
- exit code: 1

Incorrectly formatted baseline!

C:\..\spectacular-potatoes [feature/testsecret +0 ~1 -0 | +3 ~0 -0 !]> python --version
Python 3.7.2

Without pre-commit:

C:\..\spectacular-potatoes [feature/testsecret +0 ~1 -0 | +3 ~0 -0 !]>  detect-secrets audit .secrets.baseline
Not a valid baseline file!

I'm still trying to investigate why this happens, but it looks like the actual error is not logged; see line 63 of https://github.com/Yelp/detect-secrets/blob/master/detect_secrets/core/secrets_collection.py. I don't have a Python build set up.

Anyone know why this might be happening?

EDIT:

I tried to create a toy example to see if I could get a stack trace with this:

import os

from detect_secrets import VERSION
from detect_secrets.core.baseline import get_secrets_not_in_baseline
from detect_secrets.core.baseline import trim_baseline_of_removed_secrets
from detect_secrets.core.common import write_baseline_to_file
from detect_secrets.core.log import get_logger
from detect_secrets.core.secrets_collection import SecretsCollection
from detect_secrets.core.usage import ParserBuilder
from detect_secrets.plugins.common import initialize
from detect_secrets.util import build_automaton

log = get_logger(format_string='%(message)s')

def get_baseline(baseline_filename):
    """
    :raises: IOError
    :raises: ValueError
    """
    if not baseline_filename:
        return

    fileContents = _get_baseline_string_from_file(
            baseline_filename,
        )

    log.error(fileContents[2:])

    return SecretsCollection.load_baseline_from_string(
        fileContents[2:]
    )

def _get_baseline_string_from_file(filename):  # pragma: no cover
    """Breaking this function up for mockability."""
    try:
        with open(filename) as f:
            return f.read()

    except IOError:
        log.error(
            'Unable to open baseline file: {}\n'
            'Please create it via\n'
            '   `detect-secrets scan > {}`\n'
            .format(filename, filename),
        )
        raise

get_baseline(".secrets.baseline")

This generated the following exception:

C:\Users\animo\Documents\GitHub\detect-secrets\detect_secrets\core>python testbaseline.py
Incorrectly formatted baseline!
Traceback (most recent call last):
  File "testbaseline.py", line 48, in <module>
    get_baseline(".secrets.baseline")
  File "testbaseline.py", line 30, in get_baseline
    fileContents
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\site-packages\detect_secrets\core\secrets_collection.py", line 64, in load_baseline_from_string
    return cls.load_baseline_from_dict(json.loads(string))
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

So it looks like the exception is coming from the JSON lib. I stripped down the baseline file to just the generated_at key and added a log statement to see what JSON is being passed to SecretsCollection and found this:

ÿþ{

     " g e n e r a t e d _ a t " :   " 2 0 2 0 - 0 4 - 2 4 T 1 6 : 1 8 : 4 2 Z "

 }

There are two garbage characters at the beginning. So I modified the toy example with:

    return SecretsCollection.load_baseline_from_string(
        fileContents[2:]
    )

The two characters are removed, but the JSON lib still generates the following exception:

{
     " g e n e r a t e d _ a t " :   " 2 0 2 0 - 0 4 - 2 4 T 1 6 : 1 8 : 4 2 Z "
 }
Incorrectly formatted baseline!
Traceback (most recent call last):
  File "testbaseline.py", line 48, in <module>
    get_baseline(".secrets.baseline")
  File "testbaseline.py", line 30, in get_baseline
    fileContents[2:]
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\site-packages\detect_secrets\core\secrets_collection.py", line 64, in load_baseline_from_string
    return cls.load_baseline_from_dict(json.loads(string))
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\animo\AppData\Local\Programs\Python\Python37\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

I don't know anything about Python, so apologies if this is way off base, but I suspect there's some sort of environment issue or version mismatch with file.read, unless that's just the way the log is formatted.

Anyone else have an idea?

EDIT:

Found the issue! The characters at the beginning were the clue. I work in a multilingual environment and had set my PowerShell default output encoding to something other than UTF-8 to test the output of a different console program. As a result, .secrets.baseline was saved as UTF-16LE. I resaved the file as UTF-8 and everything is happy now 😌. If something else comes up I'll report back.
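For readers who want to see why stripping the first two characters did not help, here is a minimal sketch that reproduces both JSON errors without detect-secrets. It assumes the file was written as UTF-16LE with a byte order mark and then decoded with a single-byte code page; cp1252 is used here only as an illustration, and `try_parse` is a hypothetical helper. Every ASCII character in UTF-16LE is followed by a NUL byte, so the text stays unparseable even once the two leading characters are gone.

```python
import json

# A tiny stand-in for the baseline contents (same key as in the comment above).
baseline_text = '{"generated_at": "2020-04-24T16:18:42Z"}'

# Assumption: the shell wrote the file as UTF-16LE with a byte order mark (FF FE).
raw_bytes = b"\xff\xfe" + baseline_text.encode("utf-16-le")

# Assumption: the bytes were then decoded with a single-byte code page (cp1252).
misread = raw_bytes.decode("cp1252")


def try_parse(text):
    """Return the parsed object, or the JSONDecodeError message on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)


# ascii() keeps the output printable on any console.
print(ascii(misread[:6]))                     # BOM shows up as two characters, then NULs
print(try_parse(misread))                     # fails on the two BOM characters
print(try_parse(misread[2:]))                 # still fails: a NUL follows the "{"
print(try_parse(raw_bytes.decode("utf-16")))  # decoding as UTF-16 parses cleanly
```

The last line shows that the data itself was intact; only the decoding step was wrong.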

@rkapoor028

Great Find!

I was facing the same issue; somehow the .secrets.baseline was encoded as UCS-2 LE BOM. After changing the encoding to UTF-8, it started working for me again.
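If an editor with a "save as UTF-8" option is not at hand, the re-encoding can also be scripted. The following is a rough sketch, not part of detect-secrets: `sniff_encoding` and `rewrite_as_utf8` are hypothetical helper names, and the detection relies only on a byte order mark, so a UTF-16 file written without a BOM would not be recognized.

```python
import codecs
from pathlib import Path

# BOMs are checked longest-first so UTF-32LE (FF FE 00 00) is not
# mistaken for UTF-16LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(raw):
    """Guess the codec from a leading BOM; fall back to plain UTF-8."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return "utf-8"


def rewrite_as_utf8(path):
    """Re-save the file as UTF-8 without a BOM; return the codec that was detected."""
    target = Path(path)
    raw = target.read_bytes()
    encoding = sniff_encoding(raw)
    text = raw.decode(encoding)  # the BOM-aware codecs drop the BOM while decoding
    target.write_bytes(text.encode("utf-8"))
    return encoding
```

Calling `rewrite_as_utf8(".secrets.baseline")` overwrites the file in place, so it may be worth keeping a copy first.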

killuazhu pushed a commit to IBM/detect-secrets that referenced this issue May 28, 2020
killuazhu pushed a commit to IBM/detect-secrets that referenced this issue Jul 9, 2020
killuazhu pushed a commit to IBM/detect-secrets that referenced this issue Sep 17, 2020
@domanchi
Contributor

This will be added to the FAQ. Closing as done.

@hramadoss

hramadoss commented Apr 21, 2021

git commit -m "testing1"
[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /Users/hh/.cache/pre-commit/patch1619032024.
Detect secrets...........................................................Failed
- hook id: detect-secrets
- exit code: 1

Incorrectly formatted baseline!

I'm getting the same error. I checked the default encoding; it is set to UTF-8.

>>> import sys
>>> sys.getdefaultencoding()
'utf-8'

This is on v1.1.0 running Python 3.7.9
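Note that in Python 3 `sys.getdefaultencoding()` always returns 'utf-8' and says nothing about how a particular file was written. A more direct check is to look at the file's bytes. The sketch below is one way to do that; `inspect_baseline` is a hypothetical helper, not a detect-secrets function. It also reports `locale.getpreferredencoding(False)`, which is what `open()` normally uses when no encoding argument is given.

```python
import json
import locale
import sys
from pathlib import Path


def inspect_baseline(path):
    """Collect a few facts about how a baseline file is actually encoded."""
    raw = Path(path).read_bytes()
    report = {
        # Always 'utf-8' on Python 3, regardless of the file.
        "sys_default": sys.getdefaultencoding(),
        # Encoding open() normally falls back to when none is passed.
        "open_default": locale.getpreferredencoding(False),
        # 'fffe' or 'feff' at the start points to UTF-16; 'efbbbf' to a UTF-8 BOM.
        "first_bytes": raw[:4].hex(),
    }
    try:
        json.loads(raw.decode("utf-8"))
        report["utf8_json"] = True
    except (UnicodeDecodeError, json.JSONDecodeError):
        report["utf8_json"] = False
    return report
```

If `utf8_json` is True, the file is valid UTF-8 JSON and the "Incorrectly formatted baseline!" message is coming from something other than the encoding.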

@ssiegel95

The FAQ states "Ensure the file encoding of your baseline file is UTF-8". I believe I have, but I'm still seeing the issue. I'm doing this under WSL2 (Linux).

>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
$ detect-secrets scan > .secrets.baseline
$ file -i .secrets.baseline
.secrets.baseline: text/plain; charset=us-ascii
$ detect-secrets audit .secrets.baseline
Nothing to audit!
$ git commit -m"test" requirements.txt
[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /home/stus/.cache/pre-commit/patch1647634566-18621.
prettier.................................................................Passed
black................................................(no files to check)Skipped
isort................................................(no files to check)Skipped
Detect secrets...........................................................Failed
- hook id: detect-secrets
- exit code: 1

Incorrectly formatted baseline!

@vteran93

vteran93 commented Dec 9, 2023

Save the file as UTF-8 with BOM.

That way the output will be

file -i .secrets.baseline
.secrets.baseline: text/plain; charset=utf-8
