Select starting sheet in html/xls/sqlite from command-line #214

aborruso · 2018-11-24T15:18:58Z

Hi,
if I use `vd -b t.html -o html.csv`, I get the table below as CSV instead of the HTML table inside my file.

tag,id,nrows,ncols,classes
table,,72,4,wikitable sortable

Here `table` is the HTML table inside my t.html file. Is there a way to pass the sheet name on the command line? Something like `vd -b t.html -sheet table -o html.csv`

Thank you


saulpw · 2018-11-24T15:39:34Z

Hi @aborruso, there's not an easy way yet. I've wanted something like this myself at times. Let me see if I can come up with something. Thanks for the suggestion!

aborruso · 2019-06-17T17:53:12Z

Hi @saulpw, is there a way to open the table directly when there is only one?

My final goal is to use VisiData as an HTML-to-CSV converter, with something like the command below, in which I use an XPath query to extract only one table:

curl "http://example.com/page.html" | myScrapeUtilty -xpathRule '//table[count(tr/td)>7]' | vd -b  -f html -o out.csv

But even with only one table, VisiData asks me to choose, and it saves the sheets sheet as CSV instead of the table.

Thank you

saulpw · 2019-06-18T05:54:40Z

Hi @aborruso, try adding -p dive.vd with the attached small .vd script.

sheet	col	row	longname	input	keystrokes	comment
			open-file	-	o	
-		0	dive-row		^J

The first command opens the input from stdin (-), and the second command dives into the first row (0).
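
For reference, the attached script can also be recreated from the shell. This is a minimal sketch that assumes the fields are tab-separated in the column order shown above; the trailing empty `comment` field on each row is an assumption, and the file is written to the current directory.

```shell
# Recreate dive.vd: one header line plus two commands, seven tab-separated fields each.
fmt='%s\t%s\t%s\t%s\t%s\t%s\t%s\n'

# Header row: the command log columns.
printf "$fmt" sheet col row longname input keystrokes comment > dive.vd

# Command 1: open the input from stdin ("-").
printf "$fmt" '' '' '' open-file '-' o '' >> dive.vd

# Command 2: on the sheet named "-", dive into row 0 (the first table).
# "^J" is the literal two-character keystroke name, not a control character.
printf "$fmt" '-' '' 0 dive-row '' '^J' '' >> dive.vd
```

With dive.vd in place, the earlier pipeline would presumably end in `vd -b -f html -p dive.vd -o out.csv`; where exactly `-p` goes among the other options is an assumption.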

You can get this .vd yourself with:

run the same command you have, but without -b
press Enter and do any other manual steps
press Shift+D to go to the commandlog
finally, press Ctrl+S to save to dive.vd, which you can use with your pipeline.

dive.vd.txt

aborruso · 2019-06-18T06:32:11Z

@saulpw you are really brilliant, I'm impressed. VisiData is a kind of magic.

aborruso · 2019-06-18T07:59:51Z

@saulpw I have added a recipe in my VisiData Italian guide https://github.com/ondata/guidaVisiData/blob/master/testo/README.md#Salvare-una-tabella-HTML-in-CSV-a-partire-da-una-pagina-web

Thank you again

saulpw · 2019-08-21T03:38:20Z

Fixed for html loader in f55de38; requires changes in other loaders with a sheet index.

This permits indexing into sub-sheets through CLI (`+toplevel:subsheet::`).

saulpw · 2019-10-05T18:29:16Z

To-do to resolve this issue:

Fix loaders with sheet index to have rowdef sheets.
Write above requirement into book/loaders.md.
Improve startup with large files to remove sync(); file should load sync, cursor should jump after load completes (including ^C), or after sheet/row/col is available, if possible.

anjakefala · 2019-11-10T00:48:16Z

The IndexSheet has been developed (see visidata/sheets.py). It has the attribute rowtype = 'sheets' by default.

Loaders to be ported:

Misc:

requirements need to be added to loaders.md

saulpw · 2019-11-10T08:37:11Z

CLI syntax is +:<sheet>:<row>:<col>.

+:subsheet:: to ignore row/col
can name toplevel source index if more than one: +toplevel:subsheet::
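
To make the four fields concrete, here is a small sketch of a shell helper that assembles the argument. `vd_start_arg` is a hypothetical name, not part of VisiData; it only builds the string in the `+toplevel:subsheet:row:col` shape described above and does not call vd.

```shell
# Hypothetical helper (not part of VisiData): build a start-position argument
# of the form +toplevel:subsheet:row:col. Omitted fields are left empty.
vd_start_arg() {
    printf '+%s:%s:%s:%s\n' "${1-}" "${2-}" "${3-}" "${4-}"
}

vd_start_arg '' table_2          # prints +:table_2::
vd_start_arg toplevel subsheet   # prints +toplevel:subsheet::
vd_start_arg '' table_2 1 1      # prints +:table_2:1:1
```

The result could then be passed along, for example `vd -f html "$(vd_start_arg '' table_2)"` when reading HTML from stdin. The helper always emits all four fields on purpose, since issue #2357 (linked from this thread) reports that the shorter `+a:b` form was parsed as row:col.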

aborruso · 2020-04-23T16:07:48Z

Hi @saulpw if I run

curl -L "https://en.wikipedia.org/wiki/Olympic_medal" | vd -f html +:table_2:1:1

vd does not open table_2. What's wrong with my command?

vd 2 is really great!

anjakefala · 2020-04-23T16:38:39Z

Hey @aborruso!

Can you please open a bug report, and link to this issue?

There is not a good way for me to remember to check up on this potential bug, otherwise. 😅

aborruso changed the title ~~Ho to convert to HTML without visualization?~~ Ho to convert from HTML without visualization? Nov 24, 2018

aborruso changed the title ~~Ho to convert from HTML without visualization?~~ How to convert from HTML without visualization? Dec 2, 2018

saulpw changed the title ~~How to convert from HTML without visualization?~~ [Feature request] select starting sheet in html/xls/sqlite from command-line Dec 26, 2018

saulpw added the wishlist label Jan 11, 2019

saulpw changed the title ~~[Feature request] select starting sheet in html/xls/sqlite from command-line~~ Select starting sheet in html/xls/sqlite from command-line Jan 11, 2019

bknowles mentioned this issue May 15, 2019

Select starting table in postgres from command-line #282

Closed

anjakefala pushed a commit that referenced this issue Aug 22, 2019

[hdf5] SheetH5Obj rows now represent unloaded sheets #214

d3fda8d

This permits indexing into sub-sheets through CLI (`+toplevel:subsheet::`).

saulpw added a commit that referenced this issue Nov 12, 2019

[sqlite] update to iterload api #214

24e69d0

saulpw closed this as completed Nov 14, 2019

saulpw added the wish granted label Oct 27, 2020

aborruso mentioned this issue Nov 15, 2020

[VisiData] come ottenere una tabella (o più tabelle) da una pagina web, usando solo vd ? opendatasicilia/tansignari#164

Closed

jwhetzel-gpc mentioned this issue Apr 12, 2021

[sqlite] How to load a specific table via the command line? #949

Closed

anjakefala mentioned this issue Mar 23, 2024

[main] command-line argument +a:b is parsed as row:col, contradicting man page #2357

Closed
