Remove read.cep() from vegan? #263

Closed
jarioksa opened this issue Jan 3, 2018 · 5 comments

Comments

@jarioksa
Contributor

jarioksa commented Jan 3, 2018

We have had the function read.cep() to read legacy CEP-style ("Canoco") files in vegan since 2003. These files (except the free format versions) use an old FORTRAN format string to define the input. The only practical way I know to interpret this format and read in the data is to use FORTRAN code. For instance, the base R function read.fortran() cannot interpret and use this format. It seems that FORTRAN input is blacklisted in current R devel, at least since commit r74028 | ripley | 2018-01-03 11:33:13 +0200 (Wed, 03 Jan 2018), and these NOTEs appear in CRAN tests. This similar note is from my own desktop:

* checking compiled code ... NOTE
File ‘vegan/libs/vegan.so’:
  Found ‘_gfortran_st_close’, possibly from ‘close’ (Fortran)
    Object: ‘cepin.o’
  Found ‘_gfortran_st_open’, possibly from ‘open’ (Fortran)
    Object: ‘cepin.o’
  Found ‘_gfortran_st_read’, possibly from ‘read’ (Fortran)
    Object: ‘cepin.o’

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs.

So the question is: do we need a function to read legacy CEP files, or should we remove the function?

I personally have these files, and occasionally come across such files, but that is not a problem for me, because I can have a non-CRAN function to read them. How useful is this function in modern times? When I added this function back in February 2003, it was a critical piece to help people migrate to R and vegan, but that was long ago.

@jarioksa
Contributor Author

jarioksa commented Jan 5, 2018

Now it is official, and I got this email from CRAN:

From: Prof Brian Ripley ripley@stats.ox.ac.uk
Subject: CRAN packages using Fortran I/O
Date: Fri, 5 Jan 2018 10:48:01 +0000

This concerns packages

leaps OriGen vegan

As the manual says

We have already warned against the use of C++ iostreams not least because output is not guaranteed to appear on the R console, and this warning applies equally to Fortran (77 or 9x) output to units * and 6 ...

In particular, any package that makes use of Fortran I/O will when
compiled on Windows interfere with C I/O: when the Fortran I/O is
initialized (typically when the package is loaded) the C stdout and
stderr are switched to LF line endings. (Function init in file
src/modules/lapack/init_win.c shows how to mitigate this.)

Package vegan uses Fortran I/O to read a text file. Could this be done
by R functions such as read.fortran? If this is not possible, you could
move cepin.f to a separate DLL (or even a separate package) so that
Fortran I/O is only loaded when necessary. Or follow the hint quoted above.

--
Brian D. Ripley, ripley@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford

@jarioksa
Contributor Author

jarioksa commented Jan 19, 2018

The consensus after email correspondence was that it would be useful to retain the capability of reading legacy CEP and CANOCO files. B.D.Ripley hinted in an email that CRAN could accept a separate package if we remove Fortran I/O from vegan. We wrote package cepreader, but it was not accepted in CRAN (it was not rejected either; it just vanished into the CRAN bin).

Now I am trying to rescue the functionality by rewriting everything in R only, in branch R-read.cep. This will fail in more complicated cases that can be handled in cepreader, but we can have this function in vegan, and point to cepreader in case this fails.
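
To illustrate where an R-only reader has to do extra work: read.fortran() expects a flat vector of field descriptors, whereas a legacy file carries a single parenthesised Fortran format string, often with a repeat group. The sketch below shows one possible translation. The helper `expand_format` and the example format string are hypothetical, only repeat groups without further nesting are handled, and this is not necessarily how the code in the branch works.

```r
# Hypothetical helper (not part of vegan): turn a Fortran format string
# such as "(I3,2X,3(I3,F5.0))" into the flat vector read.fortran() expects.
expand_format <- function(fmt) {
  fmt <- toupper(gsub("[[:space:]]", "", fmt))
  fmt <- sub("^\\((.*)\\)$", "\\1", fmt)        # strip the outer parentheses
  pattern <- "([0-9]+)\\(([^()]*)\\)"           # repeat group, e.g. 3(I3,F5.0)
  m <- regmatches(fmt, regexec(pattern, fmt))[[1]]
  while (length(m) > 0) {
    # Replace the group by its body repeated the requested number of times
    expanded <- paste(rep(m[3], as.integer(m[2])), collapse = ",")
    fmt <- sub(m[1], expanded, fmt, fixed = TRUE)
    m <- regmatches(fmt, regexec(pattern, fmt))[[1]]
  }
  strsplit(fmt, ",", fixed = TRUE)[[1]]
}

fmt_vec <- expand_format("(I3,2X,3(I3,F5.0))")

# One made-up record laid out to match the format: a row number followed by
# three (column number, value) couplets. F5.0 is used here so that
# read.fortran() applies no decimal scaling.
rec_line <- sprintf("%3d  %3d%5.1f%3d%5.1f%3d%5.1f", 1L, 2L, 12.5, 5L, 3.0, 9L, 0.5)
tmp <- tempfile()
writeLines(rec_line, tmp)
rec <- read.fortran(tmp, fmt_vec)
unlink(tmp)
```

Here `fmt_vec` is a character vector of eight descriptors (the `2X` skip included), and `rec` is a one-row data frame with seven columns.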

@gavinsimpson
Contributor

This sounds good to me Jari.

@jarioksa
Contributor Author

jarioksa commented Jan 22, 2018

I have now implemented read.cep in R only and merged the changes (b49c48c).

I have tried this with the legacy CEP files I had lying around among my files, and the new code worked with all of these. There are some quirks and problems with the new R code:

  1. The whole file is first read in with readLines as a text file, and this text is then interpreted with read.fortran. This is very slow with large data sets, but I don't see this as a big problem.

  2. read.fortran has a peculiar interpretation of the F format with decimals. The data are first read in and interpreted as numbers, but if there are decimal places in the format, the numbers are divided by 10^n, where n is the number of decimals in the format. For instance, if the format is F5.1, number 100 will be turned to 10, and 0.1 to 0.01. The interpretation is correct if the data are given without decimals (100 is given as 1000 and 0.1 as 1), but if decimal numbers were used, they are wrongly represented. I try to undo this, but there may be cases where this fails.

  3. Original Fortran code could also handle open and free format Canoco data, but these files are rejected now. It would not be too difficult to add these formats, but naturally, the code would become messy. I have no plans of adding this facility to read.cep, but contributions are welcome...
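
The decimal scaling described in point 2 can be reproduced with a small sketch. The file contents are made up for illustration, and the repair step is only one possible approach under the assumption that an explicit decimal point in the field text should override the format; it is not necessarily what the merged code does.

```r
# Two fixed-width fields to be read with the Fortran format F5.1:
# "  100" has no decimal point, " 10.0" has an explicit one.
tmp <- tempfile()
writeLines(c("  100", " 10.0"), tmp)

raw_text <- readLines(tmp)
vals <- read.fortran(tmp, "F5.1")[[1]]
# read.fortran() multiplies every value by 10^-1:
# 100 becomes 10 (the implied decimal Fortran would also apply),
# but 10.0 becomes 1, although Fortran would keep it as 10.0.

# Hypothetical repair: rescale only the fields whose text contains
# an explicit decimal point.
decimals <- 1
has_point <- grepl(".", substr(raw_text, 1, 5), fixed = TRUE)
fixed_vals <- ifelse(has_point, vals * 10^decimals, vals)
unlink(tmp)
```

After the repair both fields hold (approximately, up to floating point) the value 10. A check of this kind needs the raw text of each field, which is one reason to read the file with readLines first.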

I have merged the changes in the cran-2.4 branch, and I intend to push vegan-2.4-6 to CRAN within a few days. I'll have yet another try with cepreader, too. However, it may be rejected for good. In that case we should offer a binary on GitHub: cepreader uses Fortran, and most users cannot build a binary, so we must provide one. @gavinsimpson suggested this can be easily done, but I do not know how.

@jarioksa
Contributor Author

Version 2.4-6 was pushed to CRAN today without Fortran I/O code.
