Not an issue more so a question/feature request #313

Closed
2 of 5 tasks
UTexas80 opened this issue Mar 23, 2024 · 32 comments

Comments

@UTexas80
Contributor

Report an Issue / Request a Feature

I'm submitting a (Check one with "x") :

  • bug report
  • feature request

Issue Severity Classification -

(Check one with "x") :

  • 1 - Severe
  • 2 - Moderate
  • 3 - Low
Expected Behavior
It would be great if Python code could be placed in the munge folder and processed the same way R scripts currently are. For now, I have added a reticulate::source_python("python.py") call to my R code to run my Python program and replicate the built-in ProjectTemplate behavior. Thank you for your time and consideration.
@KentonWhite
Owner

I don't see why it couldn't be possible. I think it just requires a check for Python files in load.project.R:

  for (preprocessing.script in sort(dir(dir_name, pattern = munge_files())))
  {
    message(' Running preprocessing script: ', preprocessing.script)
    source(file.path(dir_name, preprocessing.script), local = .TargetEnv)
  }
  return(my.project.info)
}

just before the source() call (since it would need to run Python instead).

We would also need to change the munge file filter just above, in munge_files().
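
A rough sketch of that control flow, written in Python only for illustration (the real change would sit in the R loop shown above): scripts are sorted by name and each one is routed by its extension. The function names pick_runner and plan, and the file names, are hypothetical.

```python
from pathlib import Path


def pick_runner(script_name):
    """Decide which interpreter a munge script should be handed to."""
    suffix = Path(script_name).suffix.lower()
    if suffix == ".py":
        return "python"  # on the R side: reticulate::source_python()
    if suffix == ".r":
        return "r"       # on the R side: source()
    raise ValueError(f"unsupported munge script: {script_name}")


def plan(script_names):
    """Pair each script with its runner, in sorted (code-point) order."""
    return [(name, pick_runner(name)) for name in sorted(script_names)]


# Hypothetical munge-folder contents.
for name, runner in plan(["02-features.py", "01-clean.R", "03-merge.R"]):
    print(f"{name} -> {runner}")
```

Note that Python's sorted() compares code points while R's sort() follows the locale's collation, so names that differ only in case or extension may be ordered differently in R.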

If you would like to take a stab at adding this I'm super happy to help!

@UTexas80
Contributor Author

Hello Kenton - Thank you for getting back to me so quickly, much appreciated. I would love to take a stab at adding this. I am looking through the load.project.R code and am excited to get this to run. Please let me know.

@KentonWhite
Owner

The first step is to fork the project and see if you can modify load.project.R to read and execute your Python files. Once that is running we can work together on writing a test for the feature!

@UTexas80
Contributor Author

UTexas80 commented Mar 26, 2024 via email

@UTexas80
Contributor Author

Hello Kenton,

I forked ProjectTemplate and updated the munge_files function regex to include python files:

munge.files <- '[.]([rR]|[pP][yY])$' # Match .R and .py files; the group lets $ anchor both alternatives
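
One detail worth checking in a pattern like this is anchoring: without grouping, `$` applies only to the last alternative. The quick check below uses Python's re module for convenience (alternation has the lowest precedence in R's regex engines as well); the file names are made up for illustration.

```python
import re

ungrouped = re.compile(r"[.][rR]|[.][pP][yY]$")  # `$` anchors only the .py branch
grouped = re.compile(r"[.]([rR]|[pP][yY])$")     # `$` anchors both branches

# Hypothetical munge-folder contents, for illustration only.
names = ["01-A.R", "02-B.py", "notes.Rmd", "cache.rds", "README.txt"]

matched_ungrouped = [n for n in names if ungrouped.search(n)]
matched_grouped = [n for n in names if grouped.search(n)]

print(matched_ungrouped)  # also picks up notes.Rmd and cache.rds
print(matched_grouped)    # only the .R and .py scripts
```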

I was wondering how to test this, since the usual call

library("ProjectTemplate"); load.project()

loads the installed package (from the main branch) rather than my fork?

@KentonWhite
Owner

You can use devtools to install a local package:

https://devtools.r-lib.org/reference/install.html

@UTexas80
Contributor Author

UTexas80 commented Mar 27, 2024 via email

@KentonWhite
Owner

No worries :) After the first one it gets kind of addicting!

@UTexas80
Contributor Author

Got it to recognize the .py file in the munge folder.

Thank you for your help Kenton, much appreciated!

And you are right, this can be kind of addicting.

@KentonWhite
Owner

Yay! The next step is writing a test. I'm away for Easter weekend and can give some suggestions on the test to write when I'm back.

@KentonWhite
Owner

I'm back from Easter Holidays.

For a package to be released to CRAN it needs to have tests for the features using testthat. In the tests folder there is a file called test-munge.R with the unit tests for the load function.

I think we should add a test to the final section called 'pass munge files to run' where, instead of making a .R file, we make a .py file and test that it is loaded by load.project. All the test file needs to do is create a variable whose existence can then be checked.
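
On the Python side, such a test file can be tiny. The sketch below assumes a file named 01-test.py that defines a variable test_var (both names are hypothetical) and uses the standard library's runpy as a stand-in runner to confirm the script defines the variable; the real test would go through load.project and testthat instead.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script body: all it does is define one variable.
SCRIPT = "test_var = 10\n"

with tempfile.TemporaryDirectory() as project_dir:
    munge_dir = Path(project_dir) / "munge"
    munge_dir.mkdir()
    script_path = munge_dir / "01-test.py"
    script_path.write_text(SCRIPT)

    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(script_path))

defined = "test_var" in namespace
print(defined, namespace.get("test_var"))
```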

Could you take a stab at adding this test?

@UTexas80
Contributor Author

UTexas80 commented Apr 2, 2024 via email

@UTexas80
Contributor Author

UTexas80 commented Apr 3, 2024 via email

@KentonWhite
Owner

My bad, it is test-load.R. The munge section is at the very end.

@UTexas80
Contributor Author

UTexas80 commented Apr 5, 2024

Thinking...

@UTexas80
Contributor Author

UTexas80 commented Apr 9, 2024

Testing...

@UTexas80
Contributor Author

Test Case: Interleaved Python and R Code Execution with Reticulate

Purpose: This test case verifies the ability of RStudio to seamlessly execute Python code interspersed with R code in a sequential order. The test utilizes the reticulate package to facilitate communication between R and Python environments.

Scope:

  • Functionality:
    • Loading and executing Python scripts within RStudio.
    • Importing Python libraries within R using reticulate.
    • Reading and writing files from Python code.
    • Capturing results from Python code within R.
  • Limitations:
    • Focuses on basic functionalities.
    • Doesn't test complex Python functionalities (e.g., object-oriented programming).

Test Design:

1. Test Environment:

  • RStudio IDE
  • testthat package installed
  • reticulate package installed
  • Python environment accessible from R

2. Test Data:

  • A temporary project directory will be created.
  • Python scripts will be dynamically generated within this directory.
  • An R script will be used to trigger the Python code execution.

3. Test Steps:

  1. Create a temporary project directory.
  2. Create a subdirectory named "munge" to store Python and R scripts.
  3. Define two Python test scripts:
    • 01-test_data.py: Imports pandas and os, creates a DataFrame named data, writes it to a CSV file (test_data_py.csv), performs a calculation (e.g., the sum of a column), and prints the result.
    • 02-test_data.py: Imports pandas, os and sys; reads and writes the CSV file (test_data_py.csv) created by 01-test_data.py, creating a DataFrame named py_data; defines a variable named subdirectory; checks whether subdirectory exists in the Python environment and passes the result to a variable named data; prints y or n accordingly; and writes a CSV file named after that result (y.csv or n.csv) to the munge subdirectory.
  4. Write the scripts to their respective files within the "munge" directory.
  5. Use reticulate's source_python function to load the Python scripts sequentially from the ProjectTemplate package.
  6. Verify that the CSV files created by the Python scripts exist, using expect_true from the testthat package together with file.exists.
  7. Verify that the Python variables are not present in the R environment, using expect_false from the testthat package.
  8. Execute the python script (01-test_data.py) to test capturing of the Python calculation result.
  9. Execute the R script (01-test_data.R) to test capturing of the R result tibble.
  10. Execute the python script (02-test_data.py) to test capturing of the Python environment result.
  11. Execute the R script (02-test_data.R) to test capturing of the R result tibble.
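
The setup and sequential-execution steps above can be sketched on the Python side alone. This is a minimal illustration under stated assumptions: the script bodies are simplified placeholders (plain file I/O instead of pandas), MUNGE_DIR is a hypothetical variable injected by the harness, and runpy stands in for the R-side runner.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script bodies; MUNGE_DIR is injected by the harness below.
SCRIPTS = {
    "01-test_data.py": (
        "import os\n"
        "csv_path = os.path.join(MUNGE_DIR, 'test_data_py.csv')\n"
        "with open(csv_path, 'w') as fh:\n"
        "    fh.write('x,y\\n1,2\\n3,4\\n')\n"
    ),
    "02-test_data.py": (
        "import os\n"
        "csv_path = os.path.join(MUNGE_DIR, 'test_data_py.csv')\n"
        "with open(csv_path) as fh:\n"
        "    row_count = len(fh.read().splitlines()) - 1\n"
    ),
}


def run_munge_scripts():
    """Create a throwaway project, write the scripts, run them in sorted order."""
    results = {}
    with tempfile.TemporaryDirectory() as project_dir:
        munge_dir = Path(project_dir) / "munge"
        munge_dir.mkdir()
        for name, body in SCRIPTS.items():
            (munge_dir / name).write_text(body)

        for script in sorted(munge_dir.glob("*.py")):
            # Each script runs in its own namespace, seeded with MUNGE_DIR.
            ns = runpy.run_path(str(script), init_globals={"MUNGE_DIR": str(munge_dir)})
            results[script.name] = ns

        csv_exists = (munge_dir / "test_data_py.csv").exists()
    return results, csv_exists


results, csv_exists = run_munge_scripts()
print(list(results), csv_exists, results["02-test_data.py"]["row_count"])
```

The second script only succeeds if the first has already run, which is the ordering property the test steps rely on.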

4. Expected Results:

  • All interspersed Python and R scripts should load in alphanumeric order without errors.
  • The Python script (01-test_data.py) should capture the expected result from the Python calculation, and the expect_true and file.exists assertion should pass.
  • The Python script (02-test_data.py) should capture the expected result from the Python environment, and the expect_true and file.exists assertion should pass.
  • The R script (01-test_data.R) should capture the expected result from the R calculation, and the expect_true and file.exists assertion should pass.
  • The R script (02-test_data.R) should capture the expected result from the R environment, and the expect_true and file.exists assertion should pass.
  • The CSV files (test_data_py.csv, write_test_data_py.csv and y.csv) created by the Python scripts should exist, and the expect_true and file.exists assertion should pass.
  • The CSV file (n.csv) should not be created by the Python script, and the expect_false and file.exists assertion should pass.
  • The data object (data) created in 01-test_data.py should not be written to the R environment, and the expect_false assertion should pass.
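
The y.csv / n.csv expectation can be illustrated with a small standalone sketch. It assumes the check is a simple name lookup in a namespace dictionary (in a real script this could be globals()); the helper write_flag_file and the CSV column names are hypothetical, and the csv module is used in place of pandas so the example runs without third-party packages.

```python
import csv
import tempfile
from pathlib import Path


def write_flag_file(munge_dir, namespace):
    """Write y.csv if 'subdirectory' is defined in `namespace`, else n.csv."""
    flag = "y" if "subdirectory" in namespace else "n"
    out_path = Path(munge_dir) / f"{flag}.csv"
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["variable", "exists"])
        writer.writerow(["subdirectory", flag])
    return out_path


with tempfile.TemporaryDirectory() as munge_dir:
    subdirectory = "munge"  # the script defines it, so the check yields "y"
    written = write_flag_file(munge_dir, {"subdirectory": subdirectory})
    y_exists = (Path(munge_dir) / "y.csv").exists()
    n_exists = (Path(munge_dir) / "n.csv").exists()

print(written.name, y_exists, n_exists)
```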

5. Pass/Fail Criteria:

  • The test case passes if all expected results are met.
  • The test case fails if any errors occur during Python and R script execution or file operations, or if any R assertion fails.

Additional Considerations:

  • This test case can be further expanded to include more complex Python functionalities and error handling scenarios.
  • The test script content (e.g., library imports, data manipulation) can be customized based on specific use cases.
  • Ensure proper library installations and environment configurations for Python and R.

Conclusion: This test case demonstrates the basic functionality of running Python code interspersed with R code using reticulate. By successfully passing this test, we gain confidence in RStudio's ability to integrate Python code within the R environment, allowing for flexible data analysis workflows that leverage the strengths of both languages.

@UTexas80
Contributor Author

@KentonWhite
Owner

Been running into trouble getting this PR working with Travis CI. The issue is that the standard R build for Travis runs on Xenial, which has Python 2.7 and minimal support for Python 3 (only Python 3.5 and no pip3 support). This causes a problem with reticulate, which really requires Python 3.6 or higher.

Meanwhile, the Bionic build, which has great Python support, lacks good R support. It has R 4.0 out of the box, which doesn't support dynamic loading of packages. This causes problems with the tibble package, which requires dynamic loading.

The solution I'm exploring at the moment is either 1) installing Python 3.6 directly on Xenial or 2) installing R release directly on Bionic.

@UTexas80
Contributor Author

UTexas80 commented Apr 22, 2024 via email

@KentonWhite
Owner

Travis build is working and everything is passing. Next step is pushing to CRAN.

@UTexas80
Contributor Author

UTexas80 commented Apr 28, 2024 via email

@KentonWhite
Owner

The test servers on CRAN don't have pandas installed in their Python installation. This causes the tests to fail, and we can't submit to CRAN with failing tests.

Is it possible to rewrite the tests so that they don't use pandas or other packages that are not part of the base Python installation?
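
For illustration, a pandas-free version of the "write a CSV, then sum a column" step might look like the sketch below, using only the standard library's csv module. The sample rows, the column names, and the helper functions are hypothetical; this is not the test code that was eventually submitted.

```python
import csv
import os
import tempfile

# Hypothetical data standing in for the DataFrame the pandas version built.
rows = [
    {"id": 1, "value": 10},
    {"id": 2, "value": 20},
    {"id": 3, "value": 30},
]


def write_rows(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["id", "value"])
        writer.writeheader()
        writer.writerows(rows)


def column_sum(path, column):
    """Read the CSV back and sum one column as integers."""
    with open(path, newline="") as fh:
        return sum(int(rec[column]) for rec in csv.DictReader(fh))


with tempfile.TemporaryDirectory() as out_dir:
    csv_path = os.path.join(out_dir, "test_data_py.csv")
    write_rows(csv_path, rows)
    total = column_sum(csv_path, "value")

print(total)
```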

@UTexas80
Contributor Author

UTexas80 commented May 22, 2024 via email

@UTexas80
Contributor Author

UTexas80 commented Jun 1, 2024

Working on it...

@UTexas80
Contributor Author

Test complete: I rewrote the tests so that they don't use pandas or other packages that are not part of the base Python installation.

@KentonWhite
Owner

I'm not finding the new tests. Can you submit a new pull request with the changed tests, please?

@UTexas80
Contributor Author

UTexas80 commented Jun 24, 2024 via email

@KentonWhite
Owner

On its way to CRAN!

@UTexas80
Contributor Author

UTexas80 commented Jul 1, 2024 via email

@UTexas80
Contributor Author

UTexas80 commented Jul 9, 2024 via email
