Determining column count on file datasets #1374

Closed
joohokim1 opened this issue Feb 1, 2019 · 9 comments
Labels
@dataprep Component Name : Data preparation enhancement Request Change and Feature Enhancement

Comments

@joohokim1
Contributor

Is your feature request related to a problem? Please describe.
If the column count varies from row to row, it is hard to decide what the dataset's column count should be.

The candidates are:

  1. The first row alone decides. (datasource way)
  2. Visit all the rows and take the largest count. (dataprep way)
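
The two candidates can be compared on a small in-memory sample. This is only a sketch: the function names `first_row_count` and `max_row_count` are hypothetical and not taken from the project's code, and a plain `str.split` stands in for real CSV parsing.

```python
def first_row_count(lines, delimiter=","):
    """Candidate 1: the first row alone decides the column count."""
    return len(lines[0].split(delimiter)) if lines else 0


def max_row_count(lines, delimiter=","):
    """Candidate 2: visit every row and keep the largest column count."""
    return max((len(line.split(delimiter)) for line in lines), default=0)


sample = ["a,b", "a,b,c,d", "a,b,c"]
print(first_row_count(sample))  # 2
print(max_row_count(sample))    # 4
```

With the first-row rule, the extra values in the second and third rows have no column to land in; the max rule keeps them at the cost of reading every row.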

Describe the solution you'd like

  1. Visit all rows up to the upload preview limit and take the largest count.
  2. Show the result to the user in an editable text control. (Editing should rarely be needed.)
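
A minimal sketch of the proposed behavior, assuming the rows arrive as an iterable of text lines. `detect_column_count`, `preview_limit`, and `manual_count` are illustrative names, not the project's API; the manual value models the editable text control.

```python
from itertools import islice


def detect_column_count(lines, delimiter=",", preview_limit=100, manual_count=None):
    """Return the column count for a file dataset.

    A user-supplied manual_count wins; otherwise scan at most
    preview_limit rows and take the widest one.
    """
    if manual_count is not None:
        if manual_count < 1:
            raise ValueError("manual_count must be a positive integer")
        return manual_count
    scanned = islice(lines, preview_limit)
    return max((len(line.split(delimiter)) for line in scanned), default=0)


rows = ["a,b", "a,b,c", "a,b,c,d,e"]
print(detect_column_count(rows, preview_limit=2))  # 3: the third row is never visited
print(detect_column_count(rows))                   # 5
print(detect_column_count(rows, manual_count=4))   # 4: the user's value overrides detection
```

Capping the scan keeps the preview cheap, but a wider row may sit beyond the limit, which is the case the editable control is meant to cover.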

Describe alternatives you've considered

  1. The first row alone decides. (datasource way)

Additional context
finefood.sample.txt
If you import and wrangle the file above (using a colon as the delimiter), you'll see 12 columns in the upload preview and 34 columns in the I.DS preview. The full file (not the sample) yields 55 columns. (Some rows contain http references, or simply have that many columns.)
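
To illustrate why the count can vary so widely with a colon delimiter, here is a small sketch using made-up rows (they are not taken from finefood.sample.txt): every `http://` reference inside a value contributes an extra colon and therefore an extra column.

```python
# Hypothetical rows; the second value contains two URLs.
rows = [
    "name: Alice",
    "note: see http://example.com/a and http://example.com/b",
]

# Naive split on the colon delimiter, one count per row.
counts = [len(row.split(":")) for row in rows]
print(counts)       # [2, 4]
print(max(counts))  # 4
```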

@joohokim1 joohokim1 added enhancement Request Change and Feature Enhancement @dataprep Component Name : Data preparation labels Feb 1, 2019
@joohokim1 joohokim1 added this to the 3.2.0 milestone Feb 1, 2019
@joohokim1 joohokim1 self-assigned this Feb 1, 2019
@minjung-cho

Screenshot 2019-03-13 4:30:47 PM
@joohokim1 @AnnieHwang Would the attached proposal satisfy the requirement? Please review.

@minjung-cho minjung-cho added the awaiting feedback need to feedback label Mar 13, 2019
@joohokim1
Contributor Author

joohokim1 commented Mar 14, 2019

@minjung-cho @AnnieHwang

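The slide above says that rows are separated by paragraph and columns by period, but I think this is a miscommunication. (It is probably something I said while discussing unstructured data.) You can delete it.

The screen design looks good. Once the description on the right is fixed, and if there are no other opinions, you can proceed right away.

For reference, the "rows and columns are separated automatically based on the data" item mentioned in the slide will be handled in #1654.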

@joohokim1 joohokim1 removed the awaiting feedback need to feedback label Mar 14, 2019
@minjung-cho

@joohokim1 Yes, I will remove the error you pointed out from the planning document.

@koeun222

@AnnieHwang Sharing the design.

  1. Column count added: shown only for CSV, TXT, and Excel
  2. Error position adjusted
    image

@AnnieHwang
Contributor

@minjung-cho I don't really understand what the "Column Count" label means. I think it needs to be made clearer so that users can understand it easily.

@minjung-cho minjung-cho added the awaiting feedback need to feedback label Mar 15, 2019
@minjung-cho

Screenshot 2019-03-15 3:43:32 PM

@AnnieHwang I've applied what we just discussed. Please check. @joohokim1

@joohokim1
Contributor Author

@minjung-cho @AnnieHwang

Yes, I've checked it. It looks very good.

@joohokim1 joohokim1 removed the awaiting feedback need to feedback label Mar 15, 2019
@koeun222

koeun222 commented Mar 15, 2019

@AnnieHwang Sharing the design with the changes applied.
image
image

@kaypark-skt kaypark-skt self-assigned this Mar 22, 2019
joohokim1 pushed a commit that referenced this issue Mar 25, 2019
* #1374 add manual Column Count on file datasets(CSV, EXCEL)

* #1374 using manualColumnCount of PrDataset
@joohokim1
Contributor Author

Closing as completed.

knockknockyoo pushed a commit that referenced this issue Mar 27, 2019
* #1374 add manual Column Count on file datasets(CSV, EXCEL)

* #1374 using manualColumnCount of PrDataset
joohokim1 pushed a commit that referenced this issue Apr 3, 2019
* #1223 Show full dsName and desc with browser tooltip (union popup)

* #1223 chart expression update
- add tooltip
- add tooltip util(each line displays twenty words at maximum in two lines)
- add abbreviated icon title
- change icon symbol size to 55
- replace icon images by svg format

* #1223 Delete snapshot type column from list

* #1223 chart expression update
- typo in svg file
- add svg icons

* #1463 support spatial operation

* #1693 Show file upload location when creating dataset with file (#1710)

* #1693 show file upload location

* #1693 Modify to location when uploading starts

* #1681 name after W.DS not I.DS (#1712)

* #1374 determining column count on file datasets

* #1374 add manual Column Count on file datasets(CSV, EXCEL)

* #1374 using manualColumnCount of PrDataset

* #1049 Add linked source geo type preview message

* Set new version to 3.2.0-rc4

* #1049 add icon type filter in storage select box component

* #1049 linked geo type message

* #1049 add schema config data preview component

* #1049 derived field preview message in datasource grid field

* #1049 add derived field no preview in create datasource config step

* #1049 add linked source guide message in datasource detail grid component

* #1049 add linked source guide message in dashboard data preview component

* #1049 add css data preview none class

* #1049 fix css dashboard data preview component

* #1049 fix alias error in hive connection

* #1049 import StringUtil

* fn-fix snapshot data preview field length in datasource

* fn-add support dimension type : TEXT

* #1717 Fix ruleIdx when sending rename transform api

- When rename popup is opened via snapshot create popup and is in edit mode

* 1727-Add file name in file datasource detail page

* fn add file name in create file datasource

* fn-uploadFileName property added

* fn change ingestion property in datasource create step

* #1711 Embedded chart implementation error when parameter is long (#1718)

* #1223 Show full dsName and desc with browser tooltip (union popup)

* #1223 chart expression update
- add tooltip
- add tooltip util(each line displays twenty words at maximum in two lines)
- add abbreviated icon title
- change icon symbol size to 55
- replace icon images by svg format

* #1223 Delete snapshot type column from list

* #1223 chart expression update
- typo in svg file
- add svg icons

* #1223 tooltip expression and icon rearrangement
- add tooltip util
- change tooltip expression(each line displays twenty words at maximum)
- fix height, only allow resizing horizontally
- replace chart images by svg format

* #1223 Change dataset default name
- use .extension for file types
- _databasename for database and staging
- update language file
- update svg icons

* #1223 Treat txt as csv when dealing with delimiter, fix typo, fix getFileFormat logic (prep common util)

* #1223 Add deleted code from merging (show error msg when hdfs upload fails)

* #1223 Fixe column delimiter error due to .txt extension in file

* #1223 Show svg icon in dataset list.
- Add icon in list
- Show type of database eg. mysql, hive in dataset detail page

* #1223 Add svg icon in dataset list, change source/type label in dataset list and detail

* #1223 Fix type information in add dataset popup

* #1223 Fix getDatasetType method in dataprep common util

* #1223 Add svg icon in last step of dataset creation

* #1223 Add druid svg icon
6 participants