Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a way to do a simple transposed view of csv ? #321

Open
ashmishr opened this issue Apr 6, 2020 · 12 comments
Open

Is there a way to do a simple transposed view of csv ? #321

ashmishr opened this issue Apr 6, 2020 · 12 comments

Comments

@ashmishr
Copy link

ashmishr commented Apr 6, 2020

I often have to query time series data-warehouses with over 500 columns . I tried xtab and reshape but cannot get a simple transform of just rotating my laptop by 90 degree.
Generally the time range is a week and we need to see all the values side by side to observe how its changing, so json etc is not right. Any help is appreciated.

@aborruso
Copy link
Contributor

aborruso commented Apr 6, 2020

Starting from this input

w,field1,field2,field3
a,4,6,3
b,5,4,2

and running

mlr --csv reshape -r "[^w]" -o item,value input.csv

I have this

+---+--------+-------+
| w | item   | value |
+---+--------+-------+
| a | field1 | 4     |
| a | field2 | 6     |
| a | field3 | 3     |
| b | field1 | 5     |
| b | field2 | 4     |
| b | field3 | 2     |
+---+--------+-------+

Is not what you want? If no, could you insert an example input and an example output?

@ashmishr
Copy link
Author

ashmishr commented Apr 7, 2020

Thanks for answering. I need a simple rotate left /transpose for example ,

Input csv is 4 column and 5 rows, with h1..4 being headers

h1 ,h2, h3, h4
r1v1,r1v2,r1v3,r1v4
r2v1,r2v2,r2v3,r2v4
r3v1,r3v2,r3v3,r3v4
r4v1,r4v2,r4v3,r4v4
r5v1,r5v2,r5v3,r5v4


Output is just like I rotated it left 90 degree.

h1 r1v1 r2v1 r3v1 r4v1 r5v1
h2 r1v2 r2v2 r3v2 r4v2 r5v2
h3 r1v3 r2v3 r3v3 r4v3 r5v3
h4 r1v4 r2v4 r3v4 r4v4 r5v4


This is very useful when seeing changes in v1 over all the very small number of rows in dataset because when the number of columns or headers is in hundreds the overlapped pretty print is hard to make sense of. Maybe if there's a verb called rotate or transpose that would be very useful.

@johnkerl
Copy link
Owner

johnkerl commented Apr 7, 2020

@ashmishr nice!

At present what I have is the XTAB output format as you noted ...

Decades ago I wrote this: https://github.com/johnkerl/scripts/blob/master/fundam/xpose

$ cat example.csv
h1,h2,h3,h4
r1v1,r1v2,r1v3,r1v4
r2v1,r2v2,r2v3,r2v4
r3v1,r3v2,r3v3,r3v4
r4v1,r4v2,r4v3,r4v4
r5v1,r5v2,r5v3,r5v4

$ mlr --c2p cat example.csv | xpose
h1 r1v1 r2v1 r3v1 r4v1 r5v1
h2 r1v2 r2v2 r3v2 r4v2 r5v2
h3 r1v3 r2v3 r3v3 r4v3 r5v3
h4 r1v4 r2v4 r3v4 r4v4 r5v4

$ mlr --c2p cat example.csv | xpose | left
h1 r1v1 r2v1 r3v1 r4v1 r5v1
h2 r1v2 r2v2 r3v2 r4v2 r5v2
h3 r1v3 r2v3 r3v3 r4v3 r5v3
h4 r1v4 r2v4 r3v4 r4v4 r5v4

where https://github.com/johnkerl/scripts/blob/master/fundam/left is nice in case the cell values aren't all the same length (even though they are in this example).

@ashmishr
Copy link
Author

ashmishr commented Apr 7, 2020

@johnkerl thanks. you forced me to write a one liner that I was resisting 👍

$cat example.csv | python -c 'import sys,pandas; print((pandas.read_csv(sys.stdin)).transpose().to_string())'


Or couple of lines in my shell profile so its invocable as a command from sqsh etc.

function xpose { python -c 'import sys,pandas; print((pandas.read_csv(sys.stdin)).transpose().to_string())'; }
export -f xpose;

But I hope you would please consider making it a native feature :)

@aborruso
Copy link
Contributor

aborruso commented Apr 7, 2020

@johnkerl thanks. you forced me to write a one liner that I was resisting 👍

There is a cli to do that: https://bioinf.shenwei.me/csvtk/usage/#transpose

@johnkerl
Copy link
Owner

Thanks @ashmishr! Regarding native feature ... I'm not sure this makes sense in Miller which is (uniquely) keyed by field name ...

@johnkerl
Copy link
Owner

See also #688

@johnkerl
Copy link
Owner

Closing as dup of #688 -- please re-open if I'm mistaken -- thanks! :)

@Fravadona
Copy link

But I hope you would please consider making it a native feature :)

Here's a simple way to transpose a CSV with Miller:

mlr --csv -N put -q '
    for (k,v in $*) { @arr[k][NR] = v }
    end { emit @arr }
' file.csv

@aborruso
Copy link
Contributor

aborruso commented Dec 7, 2022

You are great @Fravadona and I would need to learn to use dsl better. I am truly a goat

@dmkuzsib
Copy link

dmkuzsib commented Dec 23, 2022

there is also datamash

$ echo "1 2 3 4 5"|datamash --field-separator ' ' transpose

@coolbq
Copy link

coolbq commented Jul 11, 2023

Thanks @ashmishr! Regarding native feature ... I'm not sure this makes sense in Miller which is (uniquely) keyed by field name ...

Actually, there are some cases we need to transpose the data file to be row-based csv, because it is origianlly a vertical csv which arrange time series data vertically. The headers are placed vertically in the first column. This means a transpose is a must before Miller do other processing.
For example, you can download this file https://www.bis.org/statistics/full_eer_d_csv_row.zip, it is from BIS (Bank for International Settlements), this is the exact case and they have some other files with this format. You could check their file list at https://www.bis.org/statistics/full_data_sets.htm

So absolutely agree with @ashmishr that transpose should be a native verb.

@johnkerl johnkerl reopened this Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants