Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[INFRA] Close stale PRs #18780

Closed
wants to merge 1 commit into from
Closed

Conversation

HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Jul 31, 2017

What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same instances with #18017

Closes #14085 - [SPARK-16408][SQL] SparkSQL Added file get Exception: is a directory …
Closes #14239 - [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism to accelerate shuffle stage.
Closes #14567 - [SPARK-16992][PYSPARK] Python Pep8 formatting and import reorganisation
Closes #14579 - [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() should return Python context managers
Closes #14601 - [SPARK-13979][Core] Killed executor is re spawned without AWS key…
Closes #14830 - [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on Pyspark examples
Closes #14963 - [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in lint-python
Closes #15227 - [SPARK-17655][SQL]Remove unused variables declarations and definations in a WholeStageCodeGened stage
Closes #15240 - [SPARK-17556] [CORE] [SQL] Executor side broadcast for broadcast joins
Closes #15405 - [SPARK-15917][CORE] Added support for number of executors in Standalone [WIP]
Closes #16099 - [SPARK-18665][SQL] set statement state to "ERROR" after user cancel job
Closes #16445 - [SPARK-19043][SQL]Make SparkSQLSessionManager more configurable
Closes #16618 - [SPARK-14409][ML][WIP] Add RankingEvaluator
Closes #16766 - [SPARK-19426][SQL] Custom coalesce for Dataset
Closes #16832 - [SPARK-19490][SQL] ignore case sensitivity when filtering hive partition columns
Closes #17052 - [SPARK-19690][SS] Join a streaming DataFrame with a batch DataFrame which has an aggregation may not work
Closes #17267 - [SPARK-19926][PYSPARK] Make pyspark exception more user-friendly
Closes #17371 - [SPARK-19903][PYSPARK][SS] window operator miss the watermark metadata of time column
Closes #17401 - [SPARK-18364][YARN] Expose metrics for YarnShuffleService
Closes #17519 - [SPARK-15352][Doc] follow-up: add configuration docs for topology-aware block replication
Closes #17530 - [SPARK-5158] Access kerberized HDFS from Spark standalone
Closes #17854 - [SPARK-20564][Deploy] Reduce massive executor failures when executor count is large (>2000)
Closes #17979 - [SPARK-19320][MESOS][WIP]allow specifying a hard limit on number of gpus required in each spark executor when running on mesos
Closes #18127 - [SPARK-6628][SQL][Branch-2.1] Fix ClassCastException when executing sql statement 'insert into' on hbase table
Closes #18236 - [SPARK-21015] Check field name is not null and empty in GenericRowWit…
Closes #18269 - [SPARK-21056][SQL] Use at most one spark job to list files in InMemoryFileIndex
Closes #18328 - [SPARK-21121][SQL] Support changing storage level via the spark.sql.inMemoryColumnarStorage.level variable
Closes #18354 - [SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: Constant Pool Limit - Class Splitting
Closes #18383 - [SPARK-21167][SS] Set kafka clientId while fetch messages
Closes #18414 - [SPARK-21169] [core] Make sure to update application status to RUNNING if executors are accepted and RUNNING after recovery
Closes #18432 - resolve com.esotericsoftware.kryo.KryoException
Closes #18490 - [SPARK-21269][Core][WIP] Fix FetchFailedException when enable maxReqSizeShuffleToMem and KryoSerializer
Closes #18585 - SPARK-21359
Closes #18609 - Spark SQL merge small files to big files Update InsertIntoHiveTable.scala

Added:
Closes #18308 - [SPARK-21099][Spark Core] INFO Log Message Using Incorrect Executor I…
Closes #18599 - [SPARK-21372] spark writes one log file even I set the number of spark_rotate_log to 0
Closes #18619 - [SPARK-21397][BUILD]Maven shade plugin adding dependency-reduced-pom.xml to …
Closes #18667 - Fix the simpleString used in error messages
Closes #18782 - Branch 2.1

Added:
Closes #17694 - [SPARK-12717][PYSPARK] Resolving race condition with pyspark broadcasts when using multiple threads

Added:
Closes #16456 - [SPARK-18994] clean up the local directories for application in future by annother thread
Closes #18683 - [SPARK-21474][CORE] Make number of parallel fetches from a reducer configurable
Closes #18690 - [SPARK-21334][CORE] Add metrics reporting service to External Shuffle Server

Added:
Closes #18827 - Merge pull request 1 from apache/master

How was this patch tested?

N/A

@HyukjinKwon
Copy link
Member Author

cc @srowen, @jiangxb1987, @gatorsmile, @ueshin, @jerryshao, @vanzin, @cloud-fan and @MLnick who I believe are interested in this and could double check this list.

@gatorsmile
Copy link
Member

Before closing these PRs, we need to figure out whether they should be taken over by someone else. If we simply close them, we might forget to fix the related issues.

@HyukjinKwon
Copy link
Member Author

@gatorsmile, I think we still could find others even after being closed if the point is to prevent forgetting. This list remains and there are many in-progress JIRAs we could find someone to work on.

@gatorsmile
Copy link
Member

If the PRs are closed, we might not read them any more. Thus, my suggestion is be careful when you want to close the PRs.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Jul 31, 2017

Is the point to leave them open to make reviewers read? I think we already had few discussion about this. The (I think) last resort, automatic closing, was already suggested before - @rxin. Also, see #13959.

I personally don't like to close them but this is my effort to prevent the last resort. I think we can identify in-progress JIRAs.

@gatorsmile
Copy link
Member

The number of JIRAs is too many. To be honest, I do not have a bandwidth to go over all the in-progress JIRAs. I suggest to review the inactive PRs one more time to ensure we do not miss any good/important PR.

@gatorsmile
Copy link
Member

BTW, I suggest to let the committers who knew the component very well to give the suggestion to close the PRs. For example, I will not suggest to close any ML PR.

@HyukjinKwon
Copy link
Member Author

@gatorsmile, Sure, I will take them out if any ML committer suggests. cc @jkbradley, @yanboliang and @mengxr who I know. Will also take a look and check them as far as I can once more.

Also, what do you think that I send an email to user/dev mailing list to make other contributors pick up PRs they are used to?

@gatorsmile
Copy link
Member

First, we need to identify a list of PRs that need to be taken over. For example, in SQL PRs, #18515 should be fixed, even if the original PR creator might not do it.

@HyukjinKwon
Copy link
Member Author

But wouldn't we still be able to look though the PRs and make the list even after being closed here? I think I am lost about why we should leave them open.

@gatorsmile
Copy link
Member

gatorsmile commented Jul 31, 2017

I will go over the SQL-related PRs in the next few days to identify which PRs should not be closed and we should find somebody else to take them over.

@HyukjinKwon
Copy link
Member Author

I guess It should be good to identify important and good PRs but I am sure most of them were inactive to review comments or Jenkins failures more than a month and I left a comment to check if they are still inactive.

I think if the author is willing to proceed, they can easily reopen. If we find someone to take over, we can simply cc.

@jerryshao
Copy link
Contributor

@HyukjinKwon , can you please also add the closed PR title closes #xxx xxxxxxxx, that would be easier for us to identify, otherwise we need to click on by one, thanks!

@HyukjinKwon
Copy link
Member Author

Sure, let me try soon.

@HyukjinKwon
Copy link
Member Author

cc @mateiz and @marmbrus, do you mind if I ask your opinion on this? I want to be sure on this and see if this can be justified in general because I have made such PRs periodically. I will stop doing this if any of you has a different idea. Please let me know.

@rxin
Copy link
Contributor

rxin commented Jul 31, 2017

If you are asking for their opinions it'd be easier if you ask more explicitly (A vs B) in one comment, rather than asking them to go through and read the entire thread ...

@SparkQA
Copy link

SparkQA commented Jul 31, 2017

Test build #80068 has finished for PR 18780 at commit 7ccc4ba.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Jul 31, 2017

Sure, I guess these should be:

A: Closing stale PRs, which are inactive to Jenkins test failures and review comments more than a month with a one-weak comment to check if it is really inactive, regardless of importance or goodness of the PRs.

B: Closing stale PRs that are carefully identified, in terms of goodness and importance of the PRs, by committers, regardless of being inactivity (mainly from the author).

@srowen
Copy link
Member

srowen commented Jul 31, 2017

@gatorsmile @tdas to rehash the logic here:

  • JIRAs track 'what' needs to happen, and PRs track a 'how' it could be fixed. Closing a PR shouldn't necessarily mean an important issue is forgotten.
  • The PR remains visible and open for comment
  • PRs can be reopened at any time, or forked, by anyone
  • There's too much to review, yes, and open stale PRs harm visibility into active issues
  • This is indeed a process right here for calling attention to a few issues that need a look; thanks for looking
  • Just say the word if there's any doubt about a PR and it will stay open. This is a lightweight reaping process

I'd be careful of falling into the trap of thinking that, because something looks important, but nobody has thought it important enough to look at in a long time, that leaving it open somehow increases the chance that Someone will work on it. It has never worked that way in the history of software projects.

I personally would rather close more aggressively, and if that looks like bad news, address why all these important things aren't getting done: need more committers? discouraged contributors? Too much complexity? how to get long-time contributors to become more active again?

This mechanism is just a messenger. But still let's just stick to closing the ones that nobody objects to.

@srowen
Copy link
Member

srowen commented Jul 31, 2017

@HyukjinKwon I would add:

#18308
#18599
#18619
#18667
#18782

18475 is already closed actually.

@HyukjinKwon
Copy link
Member Author

Sure, I added those and took out 18475.

@marmbrus
Copy link
Contributor

I'm in favor of fairly aggressive closing of inactive pull requests. The cost of reopening them is small, and the benefits of having a clear view of what is in progress are large. I think as long as the notification is clear and polite, it won't discourage contribution.

@vanzin
Copy link
Contributor

vanzin commented Jul 31, 2017

+1 to what Michael said. List seems fine to me (YARN-related issues).

You have a typo in the PR title though.

@HyukjinKwon HyukjinKwon changed the title [INTRA] Close stale PRs [INFRA] Close stale PRs Jul 31, 2017
@gatorsmile
Copy link
Member

Please take [SPARK-21287] out.

@HyukjinKwon
Copy link
Member Author

Yes, I just took out.

@gatorsmile
Copy link
Member

After we leave polite messages to close their PRs, I think we should still keep them open one more week at least. Although it is trivial to reopen it by themselves, the feelings are different.

@srowen
Copy link
Member

srowen commented Aug 2, 2017

@gatorsmile most or all of these have had pings on the PR already; Hyukjin generally pings them if there's a question then tries to close 1-2 weeks later. I agree, I wouldn't surprise anyone with a close, but it seemed like they all were clearly left with the contributor or the reason it should be closed was already given

@jiangxb1987
Copy link
Contributor

#11494 is still useful, and I guess we will review it in the near future, so please take it out.

I would also add the following:
#16456
#18683
#18690

@HyukjinKwon
Copy link
Member Author

I added those and took out 11494.

@HyukjinKwon
Copy link
Member Author

Are we okay to go ahead with the current list?

@srowen
Copy link
Member

srowen commented Aug 5, 2017

I would go ahead. Anyone can reopen something if needed.

@asfgit asfgit closed this in 3a45c7f Aug 5, 2017
@HyukjinKwon
Copy link
Member Author

Thanks all. I merged this to master.

@KevinLefevre
Copy link

Hello, sorry I'm quite new, where can I DL the patch, i have an issue that has been corrected with #18127 . Do I have to DL the master branch and recompile the JAR and replace in my maven .m2 ?

@HyukjinKwon HyukjinKwon deleted the close-prs branch January 2, 2018 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
10 participants