[SPARK-40838][INFRA][TESTS] Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest #38304

Closed
@Yikun Yikun commented Oct 19, 2022

What changes were proposed in this pull request?

Upgrade infra base image to focal-20220922 and fix ps.mlflow doctest

Why are the changes needed?

  • Upgrade infra base image to focal-20220922 (currently the latest Ubuntu 20.04 image)

  • Python package versions in the infra image are updated:

    • numpy 1.23.3 --> 1.23.4
    • mlflow 1.28.0 --> 1.29.0
    • matplotlib 3.5.3 --> 3.6.1
    • pip 22.2.2 --> 22.3
    • scipy 1.9.1 --> 1.9.3

    Full list: https://www.diffchecker.com/e6eZZaYn

  • Fix ps.mlflow doctest (due to mlflow upgrade):

**********************************************************************
File "/__w/spark/spark/python/pyspark/pandas/mlflow.py", line 158, in pyspark.pandas.mlflow.load_model
Failed example:
    with mlflow.start_run():
        lr = LinearRegression()
        lr.fit(train_x, train_y)
        mlflow.sklearn.log_model(lr, "model")
Expected:
    LinearRegression(...)
Got:
    LinearRegression()
    <mlflow.models.model.ModelInfo object at 0x7fef9578deb0>
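The "Got" output above suggests that, after the upgrade, the last call in the `with` block returns an object whose repr doctest echoes, where previously nothing was printed for it. The following is a minimal sketch of that mechanism using only the standard library. `Fitted`, `FakeModelInfo`, `fit`, `log_model_v1` and `log_model_v2` are hypothetical stand-ins, not mlflow or scikit-learn APIs, and the two workarounds shown are possible options, not necessarily what this patch does.

```python
import doctest
from contextlib import nullcontext


class Fitted:
    """Hypothetical stand-in for a fitted estimator."""

    def __repr__(self):
        return "Fitted()"


class FakeModelInfo:
    """Hypothetical stand-in for a metadata object returned by a logging call."""


def fit():
    return Fitted()


def log_model_v1():
    # Assumed older behavior: returns None, so doctest echoes nothing.
    return None


def log_model_v2():
    # Assumed newer behavior: returns an object, so doctest echoes its repr.
    return FakeModelInfo()


# Same shape as the failing example: two bare expressions inside a block.
ORIGINAL = """
>>> with nullcontext():
...     fit()
...     log_model()
Fitted(...)
"""

# Option 1: bind the return value so it is not echoed.
DISCARD = """
>>> with nullcontext():
...     fit()
...     _ = log_model()
Fitted(...)
"""

# Option 2: loosen the expected output so a trailing repr is tolerated.
LOOSE = """
>>> with nullcontext():
...     fit()
...     log_model()
Fitted...
"""


def count_failures(source, log_model):
    """Run one doctest string with ELLIPSIS enabled; return the failure count."""
    globs = {"nullcontext": nullcontext, "fit": fit, "log_model": log_model}
    test = doctest.DocTestParser().get_doctest(source, globs, "sketch", "<sketch>", 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.ELLIPSIS)
    # Discard the failure report; only the counts are of interest here.
    return runner.run(test, out=lambda text: None).failed


if __name__ == "__main__":
    print("original, v1:", count_failures(ORIGINAL, log_model_v1))
    print("original, v2:", count_failures(ORIGINAL, log_model_v2))
    print("discard,  v2:", count_failures(DISCARD, log_model_v2))
    print("loose,    v2:", count_failures(LOOSE, log_model_v2))
```

Option 1 keeps the expected output strict, while option 2 passes with either return behavior, which may matter if the doctest has to run against more than one version of the dependency.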

Does this PR introduce any user-facing change?

No, dev only

How was this patch tested?

All CI passed

@Yikun Yikun marked this pull request as ready for review October 20, 2022 07:52

Yikun commented Oct 20, 2022

@HyukjinKwon Thanks, will merge to master

@Yikun Yikun closed this in 2698d6b Oct 20, 2022
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…2 and fix ps.mlflow doctest

Closes apache#38304 from Yikun/SPARK-40838.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>