Commit 45fc83d

docs: search & searchset are now parallelizable with an index
[skip ci]
1 parent e6c14b9 commit 45fc83d

1 file changed: +2 -2 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -74,8 +74,8 @@
 | [safenames](/src/cmd/safenames.rs#L2)<br>![CKAN](docs/images/ckan.png) | <a name="safenames_deeplink"></a>Modify headers of a CSV to only have ["safe" names](/src/cmd/safenames.rs#L5-L14) - guaranteed "database-ready"/"CKAN-ready" names. |
 | [sample](/src/cmd/sample.rs#L2)<br>📇🌐🏎️ | Randomly draw rows (with optional seed) from a CSV using seven different sampling methods - [reservoir](https://en.wikipedia.org/wiki/Reservoir_sampling) (default), [indexed](https://en.wikipedia.org/wiki/Random_access), [bernoulli](https://en.wikipedia.org/wiki/Bernoulli_sampling), [systematic](https://en.wikipedia.org/wiki/Systematic_sampling), [stratified](https://en.wikipedia.org/wiki/Stratified_sampling), [weighted](https://doi.org/10.1016/j.ipl.2005.11.003) & [cluster sampling](https://en.wikipedia.org/wiki/Cluster_sampling). Supports sampling from CSVs on remote URLs. |
 | [schema](/src/cmd/schema.rs#L2)<br>📇😣🏎️👆🪄🐻‍❄️ | <a name="schema_deeplink"></a>Infer either a [JSON Schema Validation Draft 2020-12](https://json-schema.org/draft/2020-12/json-schema-validation) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/311_Service_Requests_from_2010_to_Present-2022-03-04.csv.schema.json)) or [Polars Schema](https://docs.pola.rs/user-guide/lazy/schemas/) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/NYC_311_SR_2010-2020-sample-1M.pschema.json)) from CSV data.<br>In JSON Schema Validation mode, it produces a `.schema.json` file replete with inferred data type & domain/range validation rules derived from [`stats`](#stats_deeplink). Uses multithreading to go faster if an index is present. See [`validate`](#validate_deeplink) command to use the generated JSON Schema to validate if similar CSVs comply with the schema.<br>With the `--polars` option, it produces a `.pschema.json` file that all polars commands (`sqlp`, `joinp` & `pivotp`) use to determine the data type of each column & to optimize performance.<br>Both schemas are editable and can be fine-tuned. For JSON Schema, to refine the inferred validation rules. For Polars Schema, to change the inferred Polars data types. |
-| [search](/src/cmd/search.rs#L2)<br>📇👆 | Run a regex over a CSV. Applies the regex to selected fields & shows only matching rows. |
-| [searchset](/src/cmd/searchset.rs#L2)<br>📇👆 | _Run multiple regexes over a CSV in a single pass._ Applies the regexes to each field individually & shows only matching rows. |
+| [search](/src/cmd/search.rs#L2)<br>📇🏎️👆 | Run a regex over a CSV. Applies the regex to selected fields & shows only matching rows. |
+| [searchset](/src/cmd/searchset.rs#L2)<br>📇🏎️👆 | _Run multiple regexes over a CSV in a single pass._ Applies the regexes to each field individually & shows only matching rows. |
 | [select](/src/cmd/select.rs#L2)<br>👆 | Select, re-order, reverse, duplicate or drop columns. |
 | [slice](/src/cmd/slice.rs#L2)<br>📇🏎️🗃️ | Slice rows from any part of a CSV. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice). |
 | [snappy](/src/cmd/snappy.rs#L2)<br>🚀🌐 | <a name="snappy_deeplink"></a>Does streaming compression/decompression of the input using Google's [Snappy](https://github.com/google/snappy/blob/main/docs/README.md) framing format ([more info](#automatic-compressiondecompression)). |
