Read data from dataframe #69

Closed
manycoding opened this issue Apr 17, 2019 · 0 comments
Labels
Type: Feature New feature or request
Milestone

Comments


manycoding commented Apr 17, 2019

Support at least DataFrames as input, which would allow reading data locally from any source (CSV or JSON, remote or local).

Currently the library relies on a _key field to report items by. So the implementation could look like this:

  1. Figure out a simple API (something like fastai's data block API?), for example:
items = Items.from_csv()  # cf. items.from_job
schema = Schema.get_schema(schema)
items.report_all(schema)

# Keep it granular enough that it can be used in Spidermon
arche.rules.duplicates.find_by(items.df, ["name", "title"])
  2. Add a _key column. It may be easier to use _key as the index when it is present and report by index.
  3. _type. So far nobody has really needed _type, since we can use filters.
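
To illustrate steps 1 and 2, here is a rough sketch using plain pandas. It is not the library's API: `with_key_index` and `find_duplicates_by` are hypothetical helper names, and the CSV content is made-up sample data. It only shows the idea of using `_key` as the index when it is present and reporting duplicates by that index.

```python
import io

import pandas as pd


def with_key_index(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: use `_key` as the index if the column exists."""
    if "_key" in df.columns:
        return df.set_index("_key")
    # No `_key` column: keep the default positional index for reporting.
    return df


def find_duplicates_by(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Hypothetical helper: return all rows whose values in `columns` repeat."""
    return df[df.duplicated(subset=columns, keep=False)]


# Made-up sample data standing in for a local or remote CSV file.
csv_text = """_key,name,title
job/0,Alice,Engineer
job/1,Bob,Designer
job/2,Alice,Engineer
"""

items_df = with_key_index(pd.read_csv(io.StringIO(csv_text)))
duplicates = find_duplicates_by(items_df, ["name", "title"])

# Duplicates are reported by index, which is `_key` here.
print(list(duplicates.index))
```

With `_key` as the index, any rule that works on a DataFrame can report items without caring where the data came from; when `_key` is missing, the default integer index serves the same purpose.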
@manycoding manycoding added the Type: Feature New feature or request label Apr 17, 2019
@manycoding manycoding added this to the 0.4.0dev milestone Apr 17, 2019
@manycoding manycoding modified the milestones: 0.4.0, 0.3.3 May 3, 2019
@manycoding manycoding mentioned this issue May 10, 2019