Implement "Prepend rows" command from gokbutils #1855

Open
ostephens opened this issue Nov 22, 2018 · 18 comments · May be fixed by #6461
Labels
extension (Making it easier to extend OpenRefine's UI and backend)
gsoc/outreachy (Projects proposed for internships. Please hold back from these tasks if you are not eligible.)
Type: Feature Request (Identifies requests for new features or enhancements. These involve proposing new improvements.)

Comments

@ostephens
Sponsor Member

ostephens commented Nov 22, 2018

The gokbutils extension includes a command which adds blank rows to the start of a project. This is a request to implement that command in the core product, using the same implementation as currently in gokbutils.

https://github.com/ostephens/refine-gokbutils

@thadguidry added the "extension" label on Nov 28, 2018
@tfmorris added the "Type: Feature Request" label on Jun 28, 2020

@ToVie

ToVie commented Jul 24, 2021

Hi,
Is there a reason why this enhancement is not being pursued?
Best regards,
Tobias

@wetneb
Sponsor Member

wetneb commented Jul 24, 2021

Hi @ToVie, probably simply because nobody has found the motivation or time to work on it. Personally, my priority at the moment is to ship a stable 3.5 release.

@ToVie

ToVie commented Jul 25, 2021

Hi @wetneb, OK. I thought there might be technical reasons, but I understand your priority. I find 3.5beta1 stable and reliable, though it chokes on URLs with unencoded spaces. Should I open an issue?

@wetneb
Sponsor Member

wetneb commented Jul 25, 2021

Yes please, that would be great!

@Ishankoradia

Ishankoradia commented Feb 24, 2024

Hey @wetneb, I'd like to work on this. I have gone through the contribution guidelines, set up the dev environment on my local PC, and have OpenRefine (3.9) up and running. Could you assign this to me?

From what I understand, the command class AddRowsCommand in the extension needs to be migrated to src/com/google/refine/commands. Is that right?
I am a little confused about the UI/UX. Where would this command go in the UI? Under All?

@wetneb added the "gsoc/outreachy" label on Feb 24, 2024

@wetneb
Sponsor Member

wetneb commented Feb 24, 2024

Hi @Ishankoradia, thanks a lot for the interest!
This is a task that I have proposed as an internship subject for GSoC. So intuitively I would rather reserve it for that.
Is your interest in this issue related to GSoC at all?

That being said, I think my estimate of 175h for this project is on the very generous side, especially if the design is already specified ahead of time in the issue. It vastly depends on the proficiency of the person tackling the task, of course, but I think there is also a case to be made that this issue should be available for the initial contribution period rather than as an internship subject. Even then, that wouldn't make it a good first issue either: I would encourage you to start with something smaller.

To answer your questions: yes, this task would involve writing a new Command, as well as a new Operation, and possibly a new Change (I haven't checked if an existing one can be reused).
For the UI, adding the operation under All would make sense to me.
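
To make the split between those three pieces concrete, here is a minimal, self-contained sketch. Every class and method name in it is a hypothetical stand-in, not OpenRefine's real API, and rows are modelled as plain lists of strings. The assumed division of labour is that the command parses request parameters, the operation describes what to do, and the change is the reversible edit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT OpenRefine's real classes.

// Change: the reversible edit that an undo/redo history could record.
interface ChangeSketch {
    void apply(List<List<String>> rows);
    void revert(List<List<String>> rows);
}

// Operation: a plain description of what to do, independent of any HTTP request.
class PrependRowsOperationSketch {
    final int count;

    PrependRowsOperationSketch(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        this.count = count;
    }

    ChangeSketch createChange(final int columnCount) {
        return new ChangeSketch() {
            @Override
            public void apply(List<List<String>> rows) {
                // Add `count` blank rows (all cells null) at the start.
                for (int i = 0; i < count; i++) {
                    rows.add(0, new ArrayList<String>(
                            Collections.nCopies(columnCount, (String) null)));
                }
            }

            @Override
            public void revert(List<List<String>> rows) {
                // Undo by dropping the rows that apply() put at the front.
                rows.subList(0, count).clear();
            }
        };
    }
}

// Command: turns raw request parameters into an operation.
class PrependRowsCommandSketch {
    static PrependRowsOperationSketch fromParameters(Map<String, String> params) {
        return new PrependRowsOperationSketch(Integer.parseInt(params.get("count")));
    }
}
```

Whether an existing change class can be reused instead of writing a new one is left open here, as in the comment above.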

@Ishankoradia

Hey @wetneb, yes, I was led here from the GSoC project ideas page. I am planning to apply this year and would like to take this on as my project. It looks challenging enough for me.

I will take a closer look at the codebase to understand the idea behind this task. Can you recommend any smaller issues for me to start with, just to get my hands dirty as a warm-up?

@wetneb
Sponsor Member

wetneb commented Feb 24, 2024

Yes, we specifically maintain a list of good first issues.

@tfmorris
Member

> This is a request to implement that command in the core product, using the same implementation as currently in gokbutils.

That implementation uses MassRowChange, which is a pretty resource-hungry approach: it basically copies the entire project. I would suggest creating a RowInsertChange along the lines of RowRemovalChange.

Extending the implementation to allow the specification of an insertion point (i.e., insert new rows before/after row N) would give more flexibility to users who want their new rows at the end or somewhere else.
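
As a rough illustration of the idea, the sketch below models a reversible insert change with an insertion point. It is an assumption-laden toy: the class name RowInsertChangeSketch, its methods, and the use of plain lists of strings for rows are all hypothetical, and it does not use OpenRefine's Project, Row, or Change types. The point it shows is that the change only needs to remember the insertion index and the inserted rows, rather than a copy of every row in the project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a reversible "insert rows" change over a plain list of rows.
// A row is modelled as List<String>; OpenRefine's own types are not used here.
class RowInsertChangeSketch {
    private final int insertionIndex;          // new rows go in before this index
    private final List<List<String>> newRows;  // only the inserted rows are stored

    RowInsertChangeSketch(int insertionIndex, List<List<String>> newRows) {
        this.insertionIndex = insertionIndex;
        this.newRows = new ArrayList<>(newRows);
    }

    // Convenience factory: `count` blank rows (all cells null) of the given width.
    static RowInsertChangeSketch blankRows(int insertionIndex, int count, int columnCount) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            rows.add(new ArrayList<String>(Collections.nCopies(columnCount, (String) null)));
        }
        return new RowInsertChangeSketch(insertionIndex, rows);
    }

    // Insert the stored rows in place; existing rows are shifted, not copied.
    void apply(List<List<String>> projectRows) {
        if (insertionIndex < 0 || insertionIndex > projectRows.size()) {
            throw new IndexOutOfBoundsException("insertion index " + insertionIndex);
        }
        projectRows.addAll(insertionIndex, newRows);
    }

    // Undo by removing exactly the range that apply() inserted.
    void revert(List<List<String>> projectRows) {
        projectRows.subList(insertionIndex, insertionIndex + newRows.size()).clear();
    }
}
```

In this model, an insertion index of 0 prepends the rows, and an index equal to the current row count appends them, so "prepend rows" becomes one case of the more general operation.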

@zyadtaha
Contributor

zyadtaha commented Mar 9, 2024

Hi @wetneb, I have set up the environment and want to start contributing and solving the good first issues as you advised.
I am also interested in "New operation to add blank rows" Project in GSoC 2024.
Is it still available or assigned to someone?

@wetneb
Sponsor Member

wetneb commented Mar 9, 2024

Hi @zyadtaha, this issue will only be assigned after someone successfully applies to do a GSoC internship on it (via the GSoC portal). Looking forward to reading your application there!

@Ahmed-Elgamel
Contributor

Hi @wetneb, I have read OpenRefine's GSoC projects, and this one particularly interests me. Is the feature required to add only one row, or should it add as many rows as the user specifies?

@lokendra-singh-rao

Hey @wetneb, I am coming from GSoC. This is my first time applying to GSoC, but I am confident that my skills are up to it. I do have one question: since many people in the discussion above are interested in this, will it be given to only one applicant or to more than one?

@wetneb
Sponsor Member

wetneb commented Mar 10, 2024

@Ahmed-Elgamel as you can see in the discussion above and the title of the issue, the goal is to be able to add multiple rows at once.

@lokendra-singh-rao at most one applicant will be selected to work on any given topic

@steve-kasica

steve-kasica commented Mar 12, 2024

I just stumbled upon this issue during the community meeting today, and I actually wrote some backend code a month ago that provides a proof of concept or starting point for this feature request. I just pushed it in a branch of my OpenRefine fork. I wrote it as part of an extension I'm making. I wasn't aware of the gokbutils extension until today. Coincidentally, I followed RowRemovalChange as a guide, as @tfmorris suggested.

I'm not sure if GSoC makes sense for me, but happy to be involved with whoever gets assigned this topic/issue. It fits into my multi-table data wrangling focus because in the last paper I pushed we found users who wrangle data with Python or R often have to use a UNION-like operation to insert just a few rows, which is kind of overkill. So I'm a fan of seeing this feature implemented in OpenRefine core.

@tfmorris
Member

That looks great. I didn't review in detail, but it looks like it is basically done except for the addition of unit tests which doesn't seem like enough work to be the basis for a GSoC project. Would you be willing to add a test and submit it as a PR?

A couple of things caught my eye. This comment

// It's not technically an engine dependent command, but it is easier to write
// by extending EngineDependentCommand than Command

suggests that perhaps we should refactor this class hierarchy. Can you expand on your comment? (Perhaps in the form of an issue requesting one or more refactorings, like hoisting useful methods into the superclass)
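
For discussion, the kind of refactoring I have in mind might look roughly like the sketch below. The class names, the helper method, and the way the hierarchy is split are all hypothetical and simplified; they do not reproduce the actual Command or EngineDependentCommand classes. It only shows a generally useful helper living in the root class, so that a command with no engine dependency can extend the root directly and still reuse it.

```java
// Hypothetical class names; this does not reproduce OpenRefine's actual command hierarchy.
abstract class BaseCommandSketch {
    // Helper hoisted into the root class so that every command can use it.
    protected int parsePositiveInt(String raw, String name) {
        int value;
        try {
            value = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be an integer", e);
        }
        if (value < 1) {
            throw new IllegalArgumentException(name + " must be at least 1");
        }
        return value;
    }

    abstract String handle(String rawCount);
}

// Commands that depend on facet/filter state keep their engine-specific logic here.
abstract class EngineDependentCommandSketch extends BaseCommandSketch {
    protected String describeEngine() {
        return "engine config";
    }
}

// Needs no engine, so it extends the root class directly and still reuses the helper.
class AddRowsCommandSketch extends BaseCommandSketch {
    @Override
    String handle(String rawCount) {
        int count = parsePositiveInt(rawCount, "count");
        return "insert " + count + " blank rows";
    }
}
```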

> because in the last paper I pushed we found ...

Is this paper available someplace?

@steve-kasica

Thanks @tfmorris! I'd be happy to test it and submit it as a PR. The only testing I've done is a smoke test on my machine with Postman. A little code review from someone with more experience with OpenRefine's source code would be beneficial for me too. I'm working on more commands that follow this pattern, e.g. aggregating rows, interpolating rows, copying projects, splitting projects, joining projects.

With regards to that comment, I just started diving into the OpenRefine codebase over the last month or two. So maybe I'm doing things completely wrong, and please don't hesitate to tell me if I am. But, now that I think about it a little more, I'm leaning towards refactoring that command to extend Command for consistency. But I still posted it as an issue for discussion, #6443.

I've published two papers on users' patterns and tasks when wrangling data. I can't remember which one that observation came from or if it's a synthesis of both. But those papers, with recordings of video talks, are:

@tfmorris
Member

Thanks for the papers! I'll add them to the reading list. Thanks for creating the issue for the refactoring discussion as well. I've added a draft PR there for discussion.

On closer examination of your code, it looks like it doesn't just create empty rows, but has provisions for adding new rows based on data from the front end. What does the UI look like for this?
