Question: multiline regex allowed in --replace-text? #58

Open
marscher opened this issue Oct 15, 2014 · 16 comments

@marscher

Background: I want to strip the output from an IPython notebook, which stores its data as JSON. So I would need to match the closing bracket of an "outputs" list.

@rtyley
Owner

rtyley commented Oct 15, 2014

Could you give a truncated example, before and after?

@marscher
Author

before:

 "outputs": [
  {
   "output_type": "stream",
   "stream": "stdout",
   "text": [
    "Populating the interactive namespace from numpy and matplotlib\n"
   ]
  }
 ],

after:

 "outputs": []

@marscher
Author

Maybe it works out of the box, but I have not tested it yet:
http://www.mkyong.com/regular-expressions/regular-expression-matches-multiple-line-example-java/

@rtyley
Owner

rtyley commented Oct 15, 2014

I should think adding the (?s) prefix would work. Remember to prefix the entire expression with regex: too; otherwise, I think the BFG assumes the text is just a literal.
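To illustrate what the (?s) prefix changes, here is a minimal sketch. It uses Python's `re` module as a stand-in for the Java regex engine mentioned in this thread (an assumption; both treat `(?s)` as "dot matches newlines"). The pattern is an illustrative one written for this sketch, not the pattern posted in the thread, and it says nothing about how BFG feeds text to the regex.

```python
import re

# Truncated notebook fragment from the "before" example in this thread.
notebook_fragment = (
    ' "outputs": [\n'
    '  {\n'
    '   "output_type": "stream",\n'
    '   "stream": "stdout",\n'
    '   "text": [\n'
    '    "Populating the interactive namespace from numpy and matplotlib\\n"\n'
    '   ]\n'
    '  }\n'
    ' ],\n'
)

# Illustrative pattern (hypothetical): capture the indentation of "outputs",
# then lazily consume everything up to the first "]" at that same indentation.
body = r'^([ \t]*)"outputs": \[.*?\n\1\]'

# Without (?s), "." stops at a newline, so the multi-line span cannot match.
no_flag_match = re.search("(?m)" + body, notebook_fragment)

# With (?s), "." also matches newlines and the whole list is covered.
dotall_result = re.sub("(?sm)" + body, r'\1"outputs": []', notebook_fragment)

print(no_flag_match)
print(repr(dotall_result))
```

The lazy `.*?` plus the indentation back-reference keeps the match from running past the closing bracket of this particular "outputs" list.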

@marscher
Author

My regex matches when I use it in Java, but the same regex (prefixed with regex:) in BFG leads to "bfg aborting : no refs to update". I tried both with and without escaped backslashes:

 regex:(?s)(\\s+)\"outputs\":\\s+\\[[^\\]](.+)\\1\\] 

@marscher
Author

It works well at last. Thank you.

@rtyley
Owner

rtyley commented Oct 16, 2014

> works well at last. Thank you

Great! What was the problem just before that (with #58 (comment))? How did you fix it?

@marscher
Author

You do not need to escape backslashes, and I found one typo in my regex. Best.

@rtyley
Owner

rtyley commented Oct 16, 2014

Cool - thanks.

@marscher
Author

Unfortunately, the test I ran before closing this was wrong. Multi-line regex does not seem to work.
Testing example: https://github.com/marscher/bfg_multiline_regex

marscher reopened this Oct 20, 2014

@antaflos

Just stumbled over this as well.

We kept RSA private keys in a YAML file like this (not a real private key, obviously):

private_keys:
  foo_key1.pem:
    content: |
      -----BEGIN RSA PRIVATE KEY-----
      MIIEogIBAAKKAQEAzWXZ7ZdzGe5aez+vZKsaHI4e0hRF57BoewZTOKlmF2ijVqDK
      QveAW42R1KENm4t3/ikMV0wzMjA2WZX6wpb94brw1VeTiTs7y3I6/7OgMEVrmZ/T
      eKk2JGahHqdqA3+BEsjK9OjlYgjXGtho0qnKdt5kZjv3kA2R9dwZJzghiTrqrKKI
      BN2fatZtI+MzvAv5+i91AthSzaqmO31SbZZ/ZK0vb0ehlt6oZs1Z+KZW4yo206lZ
      1lK4B4nIZF3Rn2mmD6jRs/BAIMFm/AeFOzndsxqAyAxZKKHqK5l+ZDld6J3xiKtZ
      sHQK9ijXR1iTBA6Sd4HO3QOx/+BzbbsNQnei5QIDAQABAoIBAFGSGMg1rFZxBXgK
      ZABtd/KNxBm0dNM2bqQ1GWM/k+15iZ5miZkPElRRN9/sK1K4AVbPxeGpZf1XFp7J
      fol0OW159kJnmXvNVK30ZQKBgD7BMnJoD3GHPzmyyzov4yx/GK97bH6Sa7tIK1/V
      oBQoNGye+93VZ+2E6KN/oZOKRKH7rlgf6vtKtKM00fMLA2yb52s4G7pj9MDvY0k3
      Wdo1VWgj51rPgjb3X5h4wvmoo61IZBYtmw5/iT/DZVyMs7l1vOaGF6pATKQsZybv
      KbQRAoGARhpaOngoK0qG2rtl34z9TXvZT3XMbLmDaJ+jDjjtEazOt5jS7EP1ppSK
      rRSiqZ5Sv5NqzisN6OZHLdt1JNwarNZnNItGfs/PmpZb7SSfJqarGmNx25OK7JPe
      7geK4x9I71G9HE6aMtVK4S05KpBeA4zT7gEZ51yf4hDTf1KZSL1=
      -----END RSA PRIVATE KEY-----

We have now moved those keys to another backend and would like to clean up the repo so that the YAML looks something like this:

private_keys:
  foo_key1.pem:
    content: |
      -----BEGIN RSA PRIVATE KEY-----
      REMOVED
      -----END RSA PRIVATE KEY-----

We could just nuke the YAML files that contained private keys but I'd prefer keeping them and their history around.

A regex that should match the characters between the BEGIN and END markers (according to regex101.com) is this:

(?s)-----BEGIN RSA PRIVATE KEY-----(.+)-----END RSA PRIVATE KEY-----

Unfortunately running bfg with this replace.txt doesn't work:

regex:(?s)-----BEGIN RSA PRIVATE KEY-----(.+)-----END RSA PRIVATE KEY-----==>REMOVED

java -jar ~/Downloads/bfg-1.11.8.jar --replace-text replace.txt -fi private_keys.yaml repo.git

...

Cleaning
--------

Found 1294 commits
Cleaning commits:       100% (1294/1294)
Cleaning commits completed in 618 ms.

BFG aborting: No refs to update - no dirty commits found?

What to do?
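Separately from the question of whether BFG applies the pattern across lines, the regex above has two properties worth checking: the greedy `(.+)` spans from the first BEGIN marker to the last END marker in a file, and replacing the whole match with REMOVED also drops the markers. The sketch below shows this with Python's `re` module (an assumption: a stand-in for the Java regex engine, which behaves the same way for these constructs). The key bodies are placeholders, and the second pattern with its capture groups is a hypothetical alternative, not something taken from BFG's documentation.

```python
import re

# Two placeholder keys in the YAML layout shown above (bodies are dummies).
yaml_text = (
    "private_keys:\n"
    "  foo_key1.pem:\n"
    "    content: |\n"
    "      -----BEGIN RSA PRIVATE KEY-----\n"
    "      AAAA\n"
    "      BBBB\n"
    "      -----END RSA PRIVATE KEY-----\n"
    "  foo_key2.pem:\n"
    "    content: |\n"
    "      -----BEGIN RSA PRIVATE KEY-----\n"
    "      CCCC\n"
    "      -----END RSA PRIVATE KEY-----\n"
)

# The pattern from the comment above: greedy, and the markers are part of
# the text that gets replaced.
greedy = re.compile(
    r"(?s)-----BEGIN RSA PRIVATE KEY-----(.+)-----END RSA PRIVATE KEY-----"
)
greedy_result = greedy.sub("REMOVED", yaml_text)

# Hypothetical alternative: keep the markers and the indentation in groups,
# and match each key body lazily so that keys are handled one at a time.
per_key = re.compile(
    r"(?s)(-----BEGIN RSA PRIVATE KEY-----\n)([ \t]*).+?"
    r"(\n[ \t]*-----END RSA PRIVATE KEY-----)"
)
per_key_result = per_key.sub(r"\1\2REMOVED\3", yaml_text)

print(greedy_result)
print(per_key_result)
```

With the greedy pattern, everything from the first BEGIN to the last END collapses into a single REMOVED, including the `foo_key2.pem` entry; the lazy variant leaves both entries and their markers in place.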

@antaflos

antaflos commented May 8, 2015

FWIW, this is still an issue in bfg-1.12.3.

@franekrichardson

I have raised PR #168 with a fix for this.

The fix is opt-in via an optional command-line parameter, so the default behaviour stays as it is. It has to be opt-in because it changes the processing to load the entire blob instead of one line at a time, which could cause problems if you have giant commits.
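The difference between the two processing modes described above can be sketched as follows. This is a simplified model in Python, not BFG's actual code: the helper names `replace_per_line` and `replace_whole_blob` are hypothetical, and Python's `re` stands in for the Java regex engine.

```python
import re


def replace_per_line(text: str, pattern: re.Pattern, replacement: str) -> str:
    """Apply the substitution to each line separately (hypothetical model
    of line-at-a-time processing)."""
    lines = text.splitlines(keepends=True)
    return "".join(pattern.sub(replacement, line) for line in lines)


def replace_whole_blob(text: str, pattern: re.Pattern, replacement: str) -> str:
    """Apply the substitution to the full text at once (hypothetical model
    of whole-blob processing)."""
    return pattern.sub(replacement, text)


blob = "header\nBEGIN\nsecret line 1\nsecret line 2\nEND\nfooter\n"
pattern = re.compile(r"(?s)BEGIN\n.*?\nEND")

per_line = replace_per_line(blob, pattern, "BEGIN\nREMOVED\nEND")
whole = replace_whole_blob(blob, pattern, "BEGIN\nREMOVED\nEND")

print(per_line == blob)  # no single line contains both BEGIN and END
print(repr(whole))
```

In the per-line model the pattern never sees BEGIN and END in the same input, so nothing changes, which is consistent with the "no dirty commits found" message reported earlier in the thread. The whole-blob model matches, at the cost of holding each full file in memory.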

@jessehouwing

I would love this too, in order to support JSONPath or XPath search-and-replace: #265

@caglarsayin

This is still not working. Are you planning to follow it up?

@ESWZY

ESWZY commented May 3, 2020

> I have raised a PR #168 with a fix for this.
>
> It allows you to add support using an optional command line param (default behaviour is as is), because it will change the processing to load the entire blob instead of a line at a time (which could cause problems if you have giant commits..!)

It works! I compiled your code and used this command:

java -jar bfg.jar --multi-line-regex --replace-text replace.txt

My replace.txt is

regex:(?s)if __name__ == '__main__':[.\s\S]*==>print("DELETED!!")

I successfully replaced all code after

if __name__ == '__main__':

thx~
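A note on the pattern in that replace.txt: inside a character class, `.` is a literal dot, so `[.\s\S]` is equivalent to `[\s\S]`, which already matches any character including newlines whether or not `(?s)` is present. The minimal sketch below checks this with Python's `re` module (an assumption: used here in place of the Java regex engine, which treats these constructs the same way). The sample source text is made up for illustration.

```python
import re

# Made-up Python file content ending in a __main__ guard.
source = (
    "import sys\n"
    "\n"
    "def main():\n"
    "    return 0\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    sys.exit(main())\n"
)

# Pattern as posted above, and a shorter equivalent without (?s) or the dot.
posted = re.compile(r"(?s)if __name__ == '__main__':[.\s\S]*")
shorter = re.compile(r"if __name__ == '__main__':[\s\S]*")

posted_result = posted.sub('print("DELETED!!")', source)
shorter_result = shorter.sub('print("DELETED!!")', source)

print(repr(posted_result))
print(posted_result == shorter_result)
```

Because `[\s\S]*` is greedy and unbounded, it consumes everything to the end of the file, including the final newline.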
