This is a grab-bag of stuff I created "just because".
grep and json_funcs are superseded by "gorpy", my newer project. Don't bother with them.
In order of perceived importance:
-
When you're using pandas.read_excel to parse an Excel workbook or similar, you will often find a single table that should properly be broken up into multiple tables.
split_into_subframes.split_into_subframes breaks down pandas DataFrames into sub-tables so that none of them have any empty rows or columns, which can often help with sheets that contain multiple tables.
-
Utilities for converting strings representing ranges of numbers (e.g., ["1-10", "$10 to $1,000"]) into actual numeric ranges.
For example, ["Under 10 years", "10 years old", "11-20", "21-40", "over 40"] could be correctly parsed as the ranges [(0, 10), (10, 11), (11, 20), (21, 40), (40, inf)].
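
A rough sketch of how such parsing might work is below. It is not the code in this repo: the function name `parse_range` and the keyword lists are assumptions chosen for the example, and it only handles the simple cases shown above (a lone number is treated as a range one unit wide, and "under" ranges start at 0).

```python
import math
import re

# A number with optional thousands separators and decimals, e.g. "1,000" or "2.5".
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")

# Assumed keywords marking open-ended ranges; extend as needed.
OPEN_BELOW = ("under", "less than", "below")
OPEN_ABOVE = ("over", "more than", "above")


def parse_range(text):
    """Turn a string like '11-20' or 'over 40' into a (low, high) tuple."""
    nums = [float(m.replace(",", "")) for m in NUMBER.findall(text)]
    lowered = text.lower()
    if len(nums) >= 2:
        return (nums[0], nums[1])
    if len(nums) == 1:
        n = nums[0]
        if any(word in lowered for word in OPEN_BELOW):
            return (0.0, n)
        if any(word in lowered for word in OPEN_ABOVE) or lowered.rstrip().endswith("+"):
            return (n, math.inf)
        return (n, n + 1)  # single value: assume a bin one unit wide
    raise ValueError(f"no number found in {text!r}")


labels = ["Under 10 years", "10 years old", "11-20", "21-40", "over 40"]
ranges = [parse_range(label) for label in labels]
```

Note that a hyphen between two numbers is read as a separator, so negative numbers are not supported in this sketch.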
-
When you're working with census data or similar, you often come across data that is binned and you have no access to the underlying numbers.
This allows you to increase the data's bin size, so that you can easily go from, say, 10-year age increments to 20-year age increments.
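
As an illustration of the idea (not the implementation in this repo), here is a minimal sketch that merges every `factor` consecutive bins into one. The function name `coarsen_bins` and the `((low, high), count)` layout are assumptions for the example, and it assumes the bins are sorted and contiguous.

```python
def coarsen_bins(bins, factor):
    """Merge each group of `factor` adjacent bins into one wider bin.

    `bins` is a list of ((low, high), count) pairs, sorted and contiguous.
    A final group with fewer than `factor` bins is kept as a narrower bin.
    """
    merged = []
    for i in range(0, len(bins), factor):
        group = bins[i:i + factor]
        low = group[0][0][0]    # lower edge of the first bin in the group
        high = group[-1][0][1]  # upper edge of the last bin in the group
        merged.append(((low, high), sum(count for _, count in group)))
    return merged


ten_year = [((0, 10), 5), ((10, 20), 7), ((20, 30), 3), ((30, 40), 1)]
twenty_year = coarsen_bins(ten_year, 2)
```

Going the other way (from 20-year bins to 10-year bins) is not possible this way, since the counts inside each bin are unknown.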
-
Given a JSON or YAML document that contains dicts with multiple instances of the same key, this creates a new Python object in which each repeated key is mapped to a list of all the values assigned to it.
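
For the JSON case, one way to get this behaviour is the standard library's `object_pairs_hook` argument to `json.loads`. The sketch below is only an illustration under that assumption, not necessarily how this repo does it, and the hook name `collect_duplicates` is made up. YAML would need a different mechanism and is not covered here.

```python
import json


def collect_duplicates(pairs):
    """Build a dict from (key, value) pairs, gathering repeated keys into lists."""
    result = {}
    counts = {}
    for key, value in pairs:
        counts[key] = counts.get(key, 0) + 1
        if counts[key] == 1:
            result[key] = value                 # first occurrence: keep as-is
        elif counts[key] == 2:
            result[key] = [result[key], value]  # second: start a list
        else:
            result[key].append(value)           # later ones: extend the list
    return result


doc = '{"a": 1, "a": 2, "b": {"c": 3, "c": 4, "c": 5}, "d": 6}'
parsed = json.loads(doc, object_pairs_hook=collect_duplicates)
```

One caveat of this layout: a key that appears once with a list value is indistinguishable from a key that appears several times with scalar values.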
-
numpy_to_latex converts a numpy array into a syntactically correct LaTeX representation.
numpy_array_from_string reads any string representation of a numpy array back to the original array, with or without commas.
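
Both conversions can be sketched on plain nested lists, which keeps the example free of dependencies. This is not the code in this repo: `rows_to_latex` and `nested_list_from_string` are hypothetical names, the `bmatrix` environment (which needs the amsmath package) is just one possible output format, and the parser only handles the bracketed form that `str()` gives for a numpy array, not the `array(...)` repr or values like `nan`.

```python
import ast
import re


def rows_to_latex(rows):
    """Render a 2-D nested list as a LaTeX bmatrix."""
    body = " \\\\\n".join(" & ".join(str(x) for x in row) for row in rows)
    return "\\begin{bmatrix}\n" + body + "\n\\end{bmatrix}"


def nested_list_from_string(text):
    """Parse '[[1 2]\\n [3 4]]' or '[[1, 2], [3, 4]]' into nested lists."""
    # Insert a comma wherever two tokens are separated by whitespace only.
    with_commas = re.sub(r"(?<=[\d.\]])\s+(?=[-+\d.\[])", ", ", text.strip())
    return ast.literal_eval(with_commas)


latex = rows_to_latex([[1, 2], [3, 4]])
rows = nested_list_from_string("[[1 2]\n [3 4]]")
```

With numpy installed, `arr.tolist()` would give the nested list to pass to the first function, and `numpy.array(rows)` would turn the second function's result back into an array.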
-
- be called from the command line
- search for filenames (which can include non-text files) and directory names matching regexes
- use pipes to successively refine file searches
- write results to JSON files
- order results by (and view) last modification time or file size
- limit the number of files returned (useful when ordering by mod time or size)
- automatically open all files found by the search (Windows only)
-
One pretty-prints only the keys of dicts (not the values), which can prove handy when JSON has a lot of long values (e.g., tweets).
Another extracts JSON that lies along a partially specified path. This is my JSON-extraction function. There are many like it, but this one is mine!
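
To give a feel for the keys-only view mentioned above, here is a minimal sketch. It is not the function from json_funcs: the name `key_outline` and the two-space indentation are assumptions, and it returns the lines rather than printing them.

```python
def key_outline(obj, indent=0):
    """Return the keys of a nested JSON-like object as indented lines, skipping values."""
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(" " * indent + str(key))
            lines.extend(key_outline(value, indent + 2))
    elif isinstance(obj, list):
        # Lists have no keys of their own; look inside each item.
        for item in obj:
            lines.extend(key_outline(item, indent))
    return lines


tweet = {"user": {"name": "x", "bio": "a very long bio"}, "text": "a very long tweet"}
outline = key_outline(tweet)
```

Calling `print("\n".join(outline))` shows the structure of the document without any of the long values.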