Commit
release v7; up CSV sniff perf; valid8n ignore sel
RELEASE v7.0. Tested on Notepad++ versions
    7.3.3, 8.3.3, 8.4.1, 8.5.8, 8.6.1, and 8.6.2.
FIX: greatly improve performance of CSV sniffing
    by doing a fast initial scan to determine EOL
CHANGE: automatic validation based on filename patterns
    now always parses and validates the entire file,
    even if the user has text selected that happens to be valid JSON
    at the time of the automatic validation.
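
The "fast initial scan" from the FIX entry can be sketched in isolation as follows. This is a minimal standalone illustration, not the plugin's code: `EolScanSketch` and `GuessEol` are hypothetical names, and the local `EndOfLine` enum only stands in for the plugin's type of the same name. It counts each line-ending style in one pass and picks the most common one, falling back to CRLF on ties, so the CSV sniffer only has to be run once per candidate delimiter instead of once per (EOL, delimiter) pair.

```csharp
using System;

// Stand-in for the plugin's EndOfLine enum (assumption: same three members).
enum EndOfLine { CRLF, LF, CR }

static class EolScanSketch
{
    // Count each line-ending style in a single pass over the text
    // and return the most common one. Ties fall back to CRLF.
    public static EndOfLine GuessEol(string text)
    {
        int crlf = 0, cr = 0, lf = 0;
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i++];
            if (c == '\r')
            {
                // "\r\n" counts as one CRLF, not as a CR plus an LF
                if (i < text.Length && text[i] == '\n') { crlf++; i++; }
                else cr++;
            }
            else if (c == '\n') lf++;
        }
        if (cr > crlf && cr > lf) return EndOfLine.CR;
        if (lf > cr && lf > crlf) return EndOfLine.LF;
        return EndOfLine.CRLF;
    }

    static void Main()
    {
        Console.WriteLine(GuessEol("a,b\nc,d\ne,f\n"));  // prints LF
        Console.WriteLine(GuessEol("a,b\r\nc,d\r\n"));    // prints CRLF
        Console.WriteLine(GuessEol("a,b\rc,d\r"));        // prints CR
    }
}
```

Because the scan is a single linear pass, its cost is small next to the repeated sniffing attempts it replaces.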
molsonkiko committed Feb 10, 2024
1 parent 5a418d7 commit 85bf009
Showing 9 changed files with 133 additions and 88 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -49,8 +49,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
- `loop()` function used in `s_sub` callbacks is not thread-safe. *This doesn't matter right now* because RemesPath is single-threaded, but it could matter in the future.
- __GrepperForm loses its JSON permanently when the buffer associated with its treeview is deleted.__
- Since v7.0, holding down `Enter` in a multiline textbox (like the [tree viewer query box](/docs/README.md#remespath)) only adds one newline when the key is lifted.
- Maybe refresh error form automatically when doing the automatic parse (but not schema validation) after editing?

## [7.0.0] - (UNRELEASED) YYYY-MM-DD
## [7.0.0] - 2024-02-09

### Added

@@ -75,6 +76,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
8. [Automatic linting after edits](/docs/README.md#automatic-validation-of-json-against-json-schema) will always attempt to parse the entire document, even if the user has made a selection that could be parsed as JSON.
9. Numbers with unnecessary leading 0's (like `01` or `002.5`) are now [logged at the `BAD` level](/docs/README.md#parser-settings), and numbers with trailing decimal points are now logged at the `JSON5` level.
10. [Error form](/docs/README.md#error-form-and-status-bar) keypress triggers now execute when the key is released, rather than when it is depressed.
11. [Automatic JSON schema validation](/docs/README.md#automatic-validation-of-json-against-json-schema) now ignores the user's selections and always validates the entire document.

### Fixed

80 changes: 58 additions & 22 deletions JsonToolsNppPlugin/Forms/RegexSearchForm.cs
@@ -13,6 +13,7 @@ public partial class RegexSearchForm : Form
{
private JsonParser jsonParser;
private JsonSchemaValidator.ValidationFunc settingsValidator;
public static bool csvCheckChangeIsAutoTriggered = false;

public RegexSearchForm()
{
@@ -54,10 +55,12 @@ public RegexSearchForm()
"}"),
0);
GetTreeViewInRegexMode();
// check it, see if we have a CSV
ParseAsCsvCheckBox.Checked = true;
if (NColumnsTextBox.Text.Length == 0)
ParseAsCsvCheckBox.Checked = false;
// check if we have a CSV file
if (Main.settings.auto_try_guess_csv_delim_newline
&& TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns))
{
SetCsvSettingsFromEolNColumnsDelim(true, eol, delim, nColumns);
}
}

public void GrabFocus()
@@ -185,36 +188,69 @@ public void ParseAsCsvCheckBox_CheckedChanged(object sender, EventArgs e)
RegexTextBox.Enabled = !showCsvButtons;
IgnoreCaseCheckBox.Enabled = !showCsvButtons;
IncludeFullMatchAsFirstItemCheckBox.Enabled = !showCsvButtons;
if (showCsvButtons && Main.settings.auto_try_guess_csv_delim_newline)
if (!csvCheckChangeIsAutoTriggered && showCsvButtons && Main.settings.auto_try_guess_csv_delim_newline
&& TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns))
{
if (TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns))
{
// we found possible NColumns, delimiter, and Newline values
NColumnsTextBox.Text = nColumns.ToString();
DelimiterTextBox.Text = ArgFunction.CsvCleanChar(delim);
QuoteCharTextBox.Text = "\"";
NewlineComboBox.SelectedIndex = eol == EndOfLine.CRLF ? 0 : eol == EndOfLine.LF ? 1 : 2;
}
SetCsvSettingsFromEolNColumnsDelim(true, eol, delim, nColumns);
}
}

private static bool TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns)
public void SetCsvSettingsFromEolNColumnsDelim(bool csvBoxShouldBeChecked, EndOfLine eol, char delim, int nColumns)
{
if (ParseAsCsvCheckBox.Checked != csvBoxShouldBeChecked)
{
csvCheckChangeIsAutoTriggered = true;
ParseAsCsvCheckBox.Checked = csvBoxShouldBeChecked;
csvCheckChangeIsAutoTriggered = false;
}
if (csvBoxShouldBeChecked)
{
NColumnsTextBox.Text = nColumns.ToString();
DelimiterTextBox.Text = ArgFunction.CsvCleanChar(delim);
QuoteCharTextBox.Text = "\"";
NewlineComboBox.SelectedIndex = eol == EndOfLine.CRLF ? 0 : eol == EndOfLine.LF ? 1 : 2;
}
}

public static bool TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns)
{
eol = EndOfLine.CRLF;
delim = '\x00';
nColumns = -1;
string text = Npp.editor.GetText(CsvSniffer.DEFAULT_MAX_CHARS_TO_SNIFF * 3 / 2);
foreach (EndOfLine maybeEol in new EndOfLine[]{EndOfLine.CRLF, EndOfLine.LF, EndOfLine.CR})
string text = Npp.editor.GetText(CsvSniffer.DEFAULT_MAX_CHARS_TO_SNIFF * 12 / 10);
int crlfCount = 0;
int crCount = 0;
int lfCount = 0;
int ii = 0;
while (ii < text.Length)
{
foreach (char maybeDelim in ",\t")
char c = text[ii++];
if (c == '\r')
{
nColumns = CsvSniffer.Sniff(text, maybeEol, maybeDelim, '"');
if (nColumns >= 2)
if (ii < text.Length && text[ii] == '\n')
{
delim = maybeDelim;
eol = maybeEol;
return true;
crlfCount++;
ii++;
}
else
crCount++;
}
else if (c == '\n')
lfCount++;
}
EndOfLine maybeEol = EndOfLine.CRLF;
if (crCount > crlfCount && crCount > lfCount)
maybeEol = EndOfLine.CR;
else if (lfCount > crCount && lfCount > crlfCount)
maybeEol = EndOfLine.LF;
foreach (char maybeDelim in ",\t")
{
nColumns = CsvSniffer.Sniff(text, maybeEol, maybeDelim, '"');
if (nColumns >= 2)
{
delim = maybeDelim;
eol = maybeEol;
return true;
}
}
return false;
15 changes: 11 additions & 4 deletions JsonToolsNppPlugin/Main.cs
@@ -444,6 +444,7 @@ public static void docs()
/// <param name="wasAutotriggered">was triggered by a direct action of the user (e.g., reformatting, opening tree view)</param>
/// <param name="preferPreviousDocumentType">attempt to re-parse the document in whatever way it was previously parsed (potentially ignoring documentType parameter)</param>
/// <param name="isRecursion">IGNORE THIS PARAMETER, IT IS ONLY FOR RECURSIVE SELF-CALLS</param>
/// <param name="ignoreSelections">If true, always parse the entire file even if the selected text is valid JSON.</param>
/// <returns></returns>
public static (ParserState parserState, JNode node, bool usesSelections, DocumentType DocumentType) TryParseJson(DocumentType documentType = DocumentType.JSON, bool wasAutotriggered = false, bool preferPreviousDocumentType = false, bool isRecursion = false, bool ignoreSelections = false)
{
@@ -1494,11 +1495,12 @@ static void ShowAboutForm()
/// parse the JSON schema,<br></br>
/// and try to validate the currently open file against the schema.<br></br>
/// Send the user a message telling the user if validation succeeded,
/// or if it failed, where the first error was.
/// or if it failed, where the first error was.<br></br>
/// If ignoreSelections, always validate the entire file even if the selected text is valid JSON.
/// </summary>
public static void ValidateJson(string schemaPath = null, bool messageOnSuccess = true)
public static void ValidateJson(string schemaPath = null, bool messageOnSuccess = true, bool ignoreSelections = false)
{
(ParserState parserState, JNode json, _, DocumentType documentType) = TryParseJson(preferPreviousDocumentType:true);
(ParserState parserState, JNode json, _, DocumentType documentType) = TryParseJson(preferPreviousDocumentType:true, ignoreSelections: ignoreSelections);
if (parserState == ParserState.FATAL || json == null)
return;
string curFname = Npp.notepad.GetCurrentFilePath();
@@ -1748,7 +1750,7 @@ static bool ValidateIfFilenameMatches(string fname, bool wasAutotriggered = fals
var regex = ((JRegex)pat).regex;
if (!regex.IsMatch(fname)) continue;
// the filename matches a pattern for this schema, so we'll try to validate it.
ValidateJson(schemaFname, false);
ValidateJson(schemaFname, false, true);
return true;
}
}
@@ -1866,6 +1868,11 @@ public static void RegexSearchToJson()
if (!regexSearchForm.Focused)
{
regexSearchForm.GrabFocus();
if (settings.auto_try_guess_csv_delim_newline)
{
bool csvBoxShouldBeChecked = RegexSearchForm.TrySniffCommonDelimsAndEols(out EndOfLine eol, out char delim, out int nColumns);
regexSearchForm.SetCsvSettingsFromEolNColumnsDelim(csvBoxShouldBeChecked, eol, delim, nColumns);
}
}
}
}
4 changes: 2 additions & 2 deletions JsonToolsNppPlugin/Properties/AssemblyInfo.cs
@@ -28,5 +28,5 @@
// Build Number
// Revision
//
[assembly: AssemblyVersion("6.1.1.20")]
[assembly: AssemblyFileVersion("6.1.1.20")]
[assembly: AssemblyVersion("7.0.0.0")]
[assembly: AssemblyFileVersion("7.0.0.0")]
Binary file modified JsonToolsNppPlugin/Release_x64.zip
Binary file not shown.
Binary file modified JsonToolsNppPlugin/Release_x86.zip
Binary file not shown.
14 changes: 7 additions & 7 deletions docs/README.md
@@ -170,7 +170,7 @@ We can perform RemesPath queries on the selections. __RemesPath queries (includi

![RemesPath query on file with selections](/docs/multi%20selections%20Remespath%20query.PNG)

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), [automatic linting after editing](#automatically-check-for-errors-after-editing) is disabled while in selection-based mode, to avoid unexpectedly changing the user's selections when the document is automatically parsed.
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), [automatic linting after editing](#automatically-check-for-errors-after-editing) is disabled while in selection-based mode, to avoid unexpectedly changing the user's selections when the document is automatically parsed.

### Selecting all valid JSON ###

@@ -200,7 +200,7 @@ For performance reasons, the error form will never have more than 5000 rows. The

__For pre-[v6.1](/CHANGELOG.md#610---2023-12-28) JsonTools, *do not click `Yes`* on the dialog that warns of slow reload.__ If you click `Yes`, you can expect to wait an *extremely long time.*

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), the error form also reports JSON schema validation errors. They are indicated by `SCHEMA` in the `Severity` column as shown below. In addition, if a file was previously validated, hitting `Enter` to refresh the error form re-validates the file using whatever schema was most recently used for that file.
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), the error form also reports JSON schema validation errors. They are indicated by `SCHEMA` in the `Severity` column as shown below. In addition, if a file was previously validated, hitting `Enter` to refresh the error form re-validates the file using whatever schema was most recently used for that file.

![Error form reporting schema validation errors](/docs/error%20form%20with%20SCHEMA%20errors.PNG)

@@ -246,7 +246,7 @@ This is off by default. If desired, this feature can be turned on in the setting

Prior to [v6.1](/CHANGELOG.md#610---2023-12-28), this automatic validation forced the file to be parsed as JSON. As of v6.1, the document will be parsed as [JSON Lines](#json-lines-documents) if the file extension is `jsonl` and as JSON otherwise. In addition, if the document is already in [regex mode](#regex-search-form) or [ini file mode](#parsing-ini-files), automatic validation is suspended.

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), this automatic validation will only ever attempt to parse the entire document, not [a selection](#working-with-selections), and automatic validation is always disabled in selection-based mode. Prior to v7.0, automatic validation could change the user's selections unexpectedly.
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), this automatic validation will only ever attempt to parse the entire document, not [a selection](#working-with-selections), and automatic validation is always disabled in selection-based mode. Prior to v7.0, automatic validation could change the user's selections unexpectedly.

## Path to current position ##

@@ -516,7 +516,7 @@ Suppose you start with this document:
} // gets moved to the very end of the doc when pretty-printing
]
```
__Pretty-printing while remembering comments produces this__ (although note that beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), this is only true if your [pretty_print_style](#pretty_print_style) is `Whitesmith` or `Google`):
__Pretty-printing while remembering comments produces this__ (although note that beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), this is only true if your [pretty_print_style](#pretty_print_style) is `Whitesmith` or `Google`):
```json
// python comments become JavaScript single-line
[
@@ -547,7 +547,7 @@ __Compressing while remembering comments produces this:__
[1, 2, 3, {"a": [1, [1.5]]}]
```

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), choosing the `PPrint` setting for [pretty_print_style](#pretty_print_style) causes comments to be remembered as follows:
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), choosing the `PPrint` setting for [pretty_print_style](#pretty_print_style) causes comments to be remembered as follows:
```json
[
["short", {"iterables": "get", "printed": "on", "one": "line"}],
@@ -627,7 +627,7 @@ Opening up a document in regex mode allows __querying and mutating the raw text

You can view CSV files (any delimiter, quote character, and newline are allowed) with the treeview, providing that they comply with [RFC 4180](https://www.ietf.org/rfc/rfc4180.txt).

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), if the new `auto_try_guess_csv_delim_newline` global setting is set to `true`, whenever the regex search form is opened, or the `Parse as CSV?` button is toggled on, the regex search form will check the first 1600 characters of the current document to detect if it is a CSV or TSV file. This makes the regex search form load more slowly, but it makes it easier to parse CSV files.
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), if the new `auto_try_guess_csv_delim_newline` global setting is set to `true`, whenever the regex search form is opened, or the `Parse as CSV?` button is toggled on, the regex search form will check the first 1600 characters of the current document to detect if it is a CSV or TSV file. This makes the regex search form load more slowly, but it makes it easier to parse CSV files.

![Regex search form viewing a CSV file](/docs/regex%20search%20form%20csv%20example.PNG)

@@ -793,7 +793,7 @@ Click the `View errors` button to see if any errors happened. If any did, a new

As of version *4.6.0*, the plugin can validate JSON against a [JSON schema](https://json-schema.org/). If the schema is valid, a message box will tell you if your JSON validates. If it doesn't validate, the plugin will tell you the first location where validation failed.

Beginning in [v7.0](/CHANGELOG.md#700---unreleased-yyyy-mm-dd), validators can catch multiple JSON schema validation problems, not just one. You can use the [error form](#error-form-and-status-bar) to see where all of the schema validation problems are.
Beginning in [v7.0](/CHANGELOG.md#700---2024-02-09), validators can catch multiple JSON schema validation problems, not just one. You can use the [error form](#error-form-and-status-bar) to see where all of the schema validation problems are.

As of version [4.11.2](/CHANGELOG.md#4112---2023-03-21), the recursion limit for validation is currently 64. Deeper JSON than that can't be validated, period. Very deep or recursive schemas will still compile.

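
The `csvCheckChangeIsAutoTriggered` flag in the `RegexSearchForm.cs` diff is a guard that lets code change a checkbox programmatically without its change handler re-running the sniff. A minimal sketch of that pattern without WinForms, under stated assumptions: `CheckBoxSketch`, `GuardFlagSketch`, `MakeBox`, `SetCheckedSilently`, and `sniffCount` are all hypothetical names, and the counter merely stands in for a call to the sniffer.

```csharp
using System;

// Hypothetical stand-in for a WinForms CheckBox: raises an event when Checked changes.
class CheckBoxSketch
{
    private bool isChecked;
    public event Action CheckedChanged;
    public bool Checked
    {
        get => isChecked;
        set
        {
            if (isChecked == value) return;  // no change, no event
            isChecked = value;
            CheckedChanged?.Invoke();
        }
    }
}

static class GuardFlagSketch
{
    // Set only while code (not the user) is changing the checkbox.
    static bool changeIsAutoTriggered = false;
    // Counts how often the handler would have run the expensive sniff.
    public static int sniffCount = 0;

    public static CheckBoxSketch MakeBox()
    {
        var box = new CheckBoxSketch();
        box.CheckedChanged += () =>
        {
            // Skip the sniff when the change was made programmatically.
            if (!changeIsAutoTriggered && box.Checked) sniffCount++;
        };
        return box;
    }

    public static void SetCheckedSilently(CheckBoxSketch box, bool value)
    {
        changeIsAutoTriggered = true;
        box.Checked = value;
        changeIsAutoTriggered = false;
    }

    static void Main()
    {
        var box = MakeBox();
        SetCheckedSilently(box, true);  // programmatic change: handler skips the sniff
        box.Checked = false;            // unchecking never sniffs
        box.Checked = true;             // user-style toggle: handler sniffs once
        Console.WriteLine(sniffCount);  // prints 1
    }
}
```

A static flag like this is only safe when all changes happen on one thread, which is the usual case for UI event handlers.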