Students often have problems answering quiz questions related to the xlsx package that is used to read Excel spreadsheets. This article highlights common problems and their solutions.
First, many students new to the Data Science Specialization have not previously needed to install a Java runtime on their computers. The xlsx package depends on the rJava and xlsxjars packages, and rJava requires the Java Runtime Environment 1.2 or above to also be present on the student's computer.
If a student attempts to load the xlsx package without a Java runtime environment installed, R reports an error indicating that Java is not installed. The message is similar on Mac OSX and Windows.
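Before loading xlsx, it can help to check what R can see of the Java setup. The following is a minimal sketch using only base R. The helper name check_java_setup is hypothetical and not part of xlsx or rJava, and finding a java executable on the PATH does not by itself guarantee that rJava will load.

```r
# Hypothetical helper (not part of xlsx or rJava): collect basic facts
# about the Java setup that is visible from this R session.
check_java_setup <- function() {
  java_path <- unname(Sys.which("java"))  # "" when java is not on the PATH
  list(
    java_on_path    = nzchar(java_path),
    java_path       = java_path,
    java_home       = Sys.getenv("JAVA_HOME", unset = NA_character_),
    # system.file() returns "" when the package is not installed
    rjava_installed = nzchar(system.file(package = "rJava"))
  )
}

str(check_java_setup())
```

If java_on_path is FALSE and java_home is NA, installing a Java runtime (or switching to a package that does not need Java) is the likely next step.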
PRO TIP: The easiest way to work around this problem is to use an R package that does not depend on Java, such as openxlsx or readxl.
For openxlsx, it's very easy.
install.packages("openxlsx")
library(openxlsx)
# read the help file to identify the arguments needed to
# correctly read the file
?read.xlsx
theData <- read.xlsx(...)
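To make the arguments more concrete, here is a small sketch that writes a built-in data set to a temporary workbook and reads part of it back. The file, the row and column selections, and the object name subsetData are illustrative assumptions, not the values needed for any particular quiz question.

```r
if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Write a small built-in data set to a temporary workbook so the
  # example does not depend on any particular spreadsheet.
  tmp <- tempfile(fileext = ".xlsx")
  openxlsx::write.xlsx(head(mtcars, 10), tmp)

  # Read spreadsheet rows 1-5 (the header row plus four data rows)
  # and columns 1-3 from the first sheet.
  subsetData <- openxlsx::read.xlsx(tmp, sheet = 1, rows = 1:5, cols = 1:3)
  print(subsetData)
}
```

The rows and cols arguments refer to positions in the spreadsheet, so the header row counts as one of the selected rows.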
The same process can be used for readxl.
install.packages("readxl")
library(readxl)
# read the help file to identify the arguments needed to
# correctly read the file
?read_excel
theData <- read_excel(...)
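As a hedged illustration of the readxl arguments, the sketch below reads a cell range from one of the example workbooks that ship with readxl. The choice of workbook, sheet, and range is an assumption made for demonstration only.

```r
if (requireNamespace("readxl", quietly = TRUE)) {
  # readxl ships a few example workbooks; use one so the sketch is
  # self-contained.
  path <- readxl::readxl_example("datasets.xlsx")

  # List the worksheet names, then read a rectangular cell range
  # (header row plus four data rows, three columns) from the first sheet.
  sheets <- readxl::excel_sheets(path)
  rangeData <- readxl::read_excel(path, sheet = sheets[1], range = "A1:C5")
  print(rangeData)
}
```

The range argument uses ordinary Excel cell notation, which is often the simplest way to pull a specific block of rows and columns.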
That said, for students who want to use the xlsx package to answer the question, there are workable solutions for Windows, Mac OSX, and Ubuntu Linux.
SOLUTION (Windows): Download and install the latest version of the Java Runtime Environment from Oracle. Note that if you are running the 64-bit version of R, you need to install the 64-bit version of the Java Runtime.
SOLUTION (Mac OSX): On newer releases of Mac OSX, this has become more complicated. A specific set of commands must be run after installing the Java Development Kit on the computer. These are documented on the rJava Issue 86 GitHub page. I have included a screenshot of this solution for students to reference directly.
SOLUTION (Ubuntu): Use the Ubuntu Advanced Packaging Tool to install Java, then reconfigure Java in R.
sudo apt-get install openjdk-8-jdk # openjdk-9-jdk has some installation issues
sudo R CMD javareconf
Then in R / RStudio install the xlsx package.
install.packages("xlsx")
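After installation, on any platform, one way to confirm that xlsx and its Java dependency work is to attempt to load the namespace and catch any error. This is only a sketch; the helper name xlsx_is_usable is hypothetical.

```r
# Hypothetical helper: TRUE if the xlsx namespace (and therefore rJava
# and the Java runtime behind it) can be loaded, FALSE otherwise.
xlsx_is_usable <- function() {
  tryCatch({
    loadNamespace("xlsx")
    TRUE
  }, error = function(e) {
    message("xlsx could not be loaded: ", conditionMessage(e))
    FALSE
  })
}

if (xlsx_is_usable()) {
  library(xlsx)
} else {
  message("Consider openxlsx or readxl, which do not depend on Java.")
}
```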
Another common problem students may encounter is an incompatibility between the version of the Java Runtime Environment that is installed on their computer and the version of R, either 32-bit or 64-bit.
For example, if one has installed the 64-bit version of R but has the 32-bit version of the Java Runtime Environment installed, R will not be able to see the Java Runtime Environment, generating the same "Java not installed" error as noted above.
SOLUTION: This problem can be resolved by either installing the 64-bit version of Java Runtime for Windows, or by changing the RStudio configuration to use the 32-bit version of R.
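To see which build of R is running, the pointer size reported by R can be inspected. The sketch below uses base R only. Printing the java -version banner is an extra step I have added for comparison; many Java runtimes mention "64-Bit" in that banner, but this is an assumption about the installed runtime rather than a guarantee.

```r
# 8-byte pointers indicate a 64-bit build of R, 4-byte pointers a 32-bit build.
r_bits <- 8 * .Machine$sizeof.pointer
cat("R architecture:", R.version$arch, "\n")
cat("R is a", r_bits, "bit build\n")

# If a java executable is on the PATH, show its version banner for comparison.
# java -version writes to stderr, so both streams are captured.
if (nzchar(Sys.which("java"))) {
  java_info <- system2("java", "-version", stdout = TRUE, stderr = TRUE)
  print(java_info)
}
```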
Note that as of July 2020, users on Stack Overflow have reported problems installing Java and rJava when the version of Windows is a non-English language version (e.g. Chinese, Polish, etc.). It appears that, because of the way the Java installer works with these versions of Windows, R and the rJava package are not able to access the JAVA_HOME directory correctly.
To correct the problem, reinstall R with the same language used by Windows. That is, on the Chinese version of Windows, install R with Chinese language support. Once installed, you can change the language to English by setting language = "en" in the .Rconsole file.
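When diagnosing this situation, it may help to check whether the JAVA_HOME value seen by the current R session points to a directory that R can actually find. The following base R sketch is a diagnostic aid only and is not taken from the Stack Overflow reports.

```r
# Inspect JAVA_HOME as seen from this R session.
java_home <- Sys.getenv("JAVA_HOME")

if (!nzchar(java_home)) {
  message("JAVA_HOME is not set in this R session.")
} else if (!dir.exists(java_home)) {
  message("JAVA_HOME is set, but R cannot find the directory: ", java_home)
} else {
  message("JAVA_HOME points to an existing directory: ", java_home)
}
```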
There are four different packages that allow R users to load Excel spreadsheets into R. An overview of these packages may be found at Reading Excel Files.
last updated 30 December 2020