oneness/csvx

CSVX

A dependency-free tool that lets you control how files of character-separated values are tokenized, transformed, and handled.

Usage

I am not sure whether I will publish this to Clojars, but if you are using tools.deps, you can add the following to deps.edn to include it in your project.

{:deps
  {github-oneness/csvx
    {:git/url "https://github.com/oneness/csvx"
     :sha "4d4ac868a0c6bdf9a5a6c32076303f1171c74db4"}}}
;; Then require it from your REPL:
(require '[csvx.core :as csvx])
;; The one public fn, `readx`, parses CSV without any options:
(csvx/readx "resources/100-sales-records.csv")
;; `csvx/readx` also takes an optional second argument, a map of options.
;; The values listed here are the defaults used when an option is not given
;; (see src/csvx/core.clj for details).
{:encoding "UTF-8"
 :max-lines-to-read Integer/MAX_VALUE
 :line-tokenizer (fn [^String line]
                   (try (.split line ",")
                        (catch Exception e
                          (println :malformed-line line)
                          ;; (strace/print-stack-trace e) ;; handle exception here
                          nil)))
 :line-transformer #(map-indexed hash-map %)}
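To see what the two function options do, it can help to apply them by hand to a single line. The sketch below copies the default tokenizer and transformer listed above (without the try/catch) and then builds an alternative option map for tab-separated input. The sample line, the `header` column names, and the var names are made up for illustration; nothing here calls csvx itself.

```clojure
;; The defaults from the option map above, applied by hand to one line.
(def default-tokenizer
  (fn [^String line] (.split line ",")))

(def default-transformer
  #(map-indexed hash-map %))

(def sample-row
  (-> "Asia,Tuvalu,Baby Food"          ; made-up sample line
      default-tokenizer
      default-transformer))
;; sample-row => ({0 "Asia"} {1 "Tuvalu"} {2 "Baby Food"})

;; A hypothetical option map for tab-separated values with named columns.
(def header [:region :country :item])  ; assumed column names

(def tsv-options
  {:line-tokenizer   (fn [^String line] (.split line "\t"))
   :line-transformer #(zipmap header %)})

(def tsv-row
  ((:line-transformer tsv-options)
   ((:line-tokenizer tsv-options) "Asia\tTuvalu\tBaby Food")))
;; tsv-row => {:region "Asia", :country "Tuvalu", :item "Baby Food"}
```

A map like `tsv-options` would be passed as the second argument to `csvx/readx`. If the file starts with a header row, that row goes through the same tokenizer and transformer as every other line.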

Say you have a giant JSON file that you need to parse into Clojure maps. All you need to do is write a line tokenizer and a line transformer like the ones below to get it working from the REPL. While you are experimenting, remember to limit how many lines are read so that you do not blow up your REPL:

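(require '[clojure.string :as str])

(defn decode-json
  "Naive decoder for flat JSON with one object per line. It breaks on nested
  values and on strings that contain `,` or `:`."
  [^String file-path]
  (csvx/readx file-path ;; e.g. "sample.json"
              {:max-lines-to-read 1
               ;; Strip the braces, then split the line into [key value] pairs.
               :line-tokenizer (fn [^String line]
                                 (map #(.split ^String % ":")
                                      (.split ^String (str/replace line #"\{|\}" "") ",")))
               ;; Fold the pairs into a map with keyword keys.
               :line-transformer (fn [pairs]
                                   (reduce (fn [acc [k v]]
                                             (assoc acc (-> k read-string keyword) (read-string v)))
                                           {}
                                           pairs))}))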
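Before pointing this at a large file, the tokenizer and transformer can be tried on one literal line. The sketch below pulls the two functions out under hypothetical names (`tokenize-json-line`, `pairs->map`) and runs them on a made-up JSON line; it does not call csvx.

```clojure
(require '[clojure.string :as str])

;; Same logic as the :line-tokenizer above: drop the braces, split on
;; commas, then split each piece into a [key value] pair.
(defn tokenize-json-line [^String line]
  (map #(.split ^String % ":")
       (.split ^String (str/replace line #"\{|\}" "") ",")))

;; Same logic as the :line-transformer above: read each key and value and
;; collect them into a map with keyword keys.
(defn pairs->map [pairs]
  (reduce (fn [acc [k v]]
            (assoc acc (-> k read-string keyword) (read-string v)))
          {}
          pairs))

(def decoded
  (-> "{\"id\": 7, \"name\": \"widget\"}"   ; made-up sample line
      tokenize-json-line
      pairs->map))
;; decoded => {:id 7, :name "widget"}
```

Because the split is purely textual, a value such as `"10:30"` or `"a,b"` would be cut in the wrong place, so this approach only suits flat objects with simple values.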
