Escape NULL (#98)
Co-authored-by: Aleksandr Kirillov <saratovsource@gmail.com>
evgeniy-r and akirill0v committed Oct 3, 2021
1 parent a68c4b8 commit 6e686dc
Showing 3 changed files with 89 additions and 12 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -6,8 +6,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### 🚀 Added
- Implement returning NULL values for PostgreSQL from transformers
[#98](https://github.com/datanymizer/datanymizer/pull/98) ([@evgeniy-r](https://github.com/evgeniy-r))
- Configurable dump transaction (whether to use, an isolation level)
[#99](https://github.com/datanymizer/datanymizer/pull/99) ([@evgeniy-r](https://github.com/evgeniy-r))
[#96](https://github.com/datanymizer/datanymizer/pull/96) ([@evgeniy-r](https://github.com/evgeniy-r))

### ⚙️ Changed

80 changes: 70 additions & 10 deletions datanymizer_dumper/src/postgres/escaper.rs
@@ -1,8 +1,25 @@
// https://www.postgresql.org/docs/13/sql-copy.html#id-1.9.3.55.9.2
/// The escaper for values from transformers.
/// The character escaping rules for the PostgreSQL COPY command are described here:
/// https://www.postgresql.org/docs/13/sql-copy.html#id-1.9.3.55.9.2
/// If we need a NULL value in our database, we must return `\N` from the transformer.
/// Example:
/// ```yaml
/// template:
/// format: '\N'
/// ```
/// If you need the `\N` literal in your database, please return `\\N` from the transformer.
/// If you need the `\\N` literal - return `\\\N` and so on.
///
/// Warning! This behavior can be changed in the future.
pub fn replace_chars(s: &mut String) {
if s == r#"\N"# {
return;
}

let len = s.len();
let mut new_s = None;
let mut beginning = 0;
let mut slash_count = 0;

for (i, c) in s.chars().enumerate() {
if let Some(replacement) = match c {
@@ -12,7 +29,10 @@ pub fn replace_chars(s: &mut String) {
'\r' => Some(r#"\r"#),
'\t' => Some(r#"\t"#),
'\x0B' => Some(r#"\v"#),
'\\' => Some(r#"\\"#),
'\\' => {
slash_count += 1;
Some(r#"\\"#)
}
_ => None,
} {
if new_s.is_none() {
@@ -29,6 +49,14 @@ pub fn replace_chars(s: &mut String) {
}

if let Some(mut new_s) = new_s {
if slash_count == len - 1 && s.ends_with('N') {
if slash_count == 2 {
return;
} else {
new_s.truncate((slash_count - 1) * 2);
}
}

if beginning < len {
new_s.push_str(&s[beginning..len])
}
@@ -44,55 +72,87 @@ mod tests {
fn replace() {
let mut s = String::from("abc\ndef");
replace_chars(&mut s);
assert_eq!(s, r#"abc\ndef"#)
assert_eq!(s, r#"abc\ndef"#);
}

#[test]
fn several() {
let mut s = String::from("abc\r\nde\tf");
replace_chars(&mut s);
assert_eq!(s, r#"abc\r\nde\tf"#)
assert_eq!(s, r#"abc\r\nde\tf"#);
}

#[test]
fn empty() {
let mut s = String::from("");
replace_chars(&mut s);
assert_eq!(s, "")
assert_eq!(s, "");
}

#[test]
fn at_beginning() {
let mut s = String::from("\t123");
replace_chars(&mut s);
assert_eq!(s, r#"\t123"#)
assert_eq!(s, r#"\t123"#);
}

#[test]
fn at_end() {
let mut s = String::from("abc\n");
replace_chars(&mut s);
assert_eq!(s, r#"abc\n"#)
assert_eq!(s, r#"abc\n"#);
}

#[test]
fn slashes() {
let mut s = String::from(r#"\ab\\c\n"#);
replace_chars(&mut s);
assert_eq!(s, r#"\\ab\\\\c\\n"#)
assert_eq!(s, r#"\\ab\\\\c\\n"#);
}

#[test]
fn only_replacements() {
let mut s = String::from("\r\n");
replace_chars(&mut s);
assert_eq!(s, r#"\r\n"#)
assert_eq!(s, r#"\r\n"#);
}

#[test]
fn all_sequences() {
let mut s = String::from("\ta\x0Bb\\c\x08\x0C\r\n");
replace_chars(&mut s);
assert_eq!(s, r#"\ta\vb\\c\b\f\r\n"#)
assert_eq!(s, r#"\ta\vb\\c\b\f\r\n"#);
}

mod null_like_sequences {
use super::*;

#[test]
fn one_slash() {
let mut s = String::from(r#"\N"#);
replace_chars(&mut s);
assert_eq!(s, r#"\N"#);
}

#[test]
fn two_slashes() {
let mut s = String::from(r#"\\N"#);
replace_chars(&mut s);
assert_eq!(s, r#"\\N"#);
}

#[test]
fn five_slashes() {
let mut s = String::from(r#"\\\\\N"#);
replace_chars(&mut s);
assert_eq!(s, r#"\\\\\\\\N"#);
}

#[test]
fn null_sequence_inside_string() {
let mut s = String::from(r#"test\Nstring"#);
replace_chars(&mut s);
assert_eq!(s, r#"test\\Nstring"#);
}
}
}
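
The null-like handling in this diff can be read as a simple rule: a lone `\N` passes through as the NULL marker, and a value made of n ≥ 2 backslashes followed by `N` is written with (n - 1) escaped backslashes. Below is a minimal sketch of that rule, assuming a hypothetical `escape_for_copy` function that returns a new `String` instead of mutating in place. It is not part of the commit, and it walks the input by `char` rather than by byte index.

```rust
/// Hypothetical, simplified variant of `replace_chars` (an assumption for
/// illustration, not the committed code). Returns the escaped value.
pub fn escape_for_copy(value: &str) -> String {
    // A lone `\N` is the NULL marker and must reach COPY unchanged.
    if value == r"\N" {
        return value.to_string();
    }

    // Null-like values: two or more backslashes followed by a single `N`.
    if let Some(slashes) = value.strip_suffix('N') {
        let n = slashes.len();
        if n >= 2 && slashes.chars().all(|c| c == '\\') {
            // Drop one backslash, then escape the remaining (n - 1) of them.
            let mut out = "\\".repeat((n - 1) * 2);
            out.push('N');
            return out;
        }
    }

    // General case: escape control characters and backslashes for COPY.
    let mut out = String::with_capacity(value.len() * 2);
    for c in value.chars() {
        match c {
            '\x08' => out.push_str(r"\b"),
            '\x0C' => out.push_str(r"\f"),
            '\n' => out.push_str(r"\n"),
            '\r' => out.push_str(r"\r"),
            '\t' => out.push_str(r"\t"),
            '\x0B' => out.push_str(r"\v"),
            '\\' => out.push_str(r"\\"),
            other => out.push(other),
        }
    }
    out
}
```

Under this sketch, five backslashes followed by `N` produce eight backslashes followed by `N`, which agrees with the `five_slashes` test, and `test\Nstring` takes the general path and becomes `test\\Nstring`.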
17 changes: 16 additions & 1 deletion docs/transformers.md
@@ -32,7 +32,7 @@ then there are no configuration options for it.

### Locales

Also many of them support locale configuration:
Also, many of them support locale configuration:

```yaml
# gets a person last name
@@ -322,6 +322,21 @@ The full list of functions for working with the store:
* `store_inc` - increments a value in a key (in the first time just stores a value). Working only with numbers.<br/>
Arguments: `key`, `value`.

Also, you can use the template transformer for returning NULL values for your database.

For PostgreSQL, we must return `\N` from the transformer:

```yaml
template:
format: '\N'
```

If you need the `\N` literal in your database, please return `\\N` from the transformer.

If you need the `\\N` literal - return `\\\N` and so on.

**Warning!** This behavior can be changed in the future.
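
To see why an extra backslash yields a literal instead of NULL, it can help to look at the reading side. The sketch below approximates how one field of PostgreSQL COPY text format is decoded, assuming the default NULL marker `\N`. `decode_copy_field` is a hypothetical helper, not part of datanymizer or PostgreSQL, and it ignores octal and hex escapes.

```rust
/// Hypothetical helper (an assumption for illustration): decodes one field of
/// COPY text format. Returns `None` when the field is the NULL marker.
pub fn decode_copy_field(field: &str) -> Option<String> {
    // The whole field must be exactly `\N` to count as NULL.
    if field == r"\N" {
        return None;
    }

    let mut out = String::with_capacity(field.len());
    let mut chars = field.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash introduces an escape sequence.
        match chars.next() {
            Some('b') => out.push('\x08'),
            Some('f') => out.push('\x0C'),
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            Some('v') => out.push('\x0B'),
            // Other backslashed characters stand for themselves (covers `\\`).
            // Octal and hex escapes are deliberately not handled here.
            Some(other) => out.push(other),
            // Trailing lone backslash: kept as-is in this sketch.
            None => out.push('\\'),
        }
    }
    Some(out)
}
```

In this model, a field of `\N` decodes to NULL, `\\N` decodes to the two-character literal `\N`, and `\\\\N` decodes to the literal `\\N`.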

## Business

#### company_activity 🌐
