Skip to content

Conversation

jklwn4
Copy link

@jklwn4 jklwn4 commented Sep 15, 2019

This PR is not ready now, commit by commit i will add more file/path related functions for wide string file/path names in Windows.

JK

@jklwn4
Copy link
Author

jklwn4 commented Sep 15, 2019

I have no idea why Travis keeps failing, i can compile and run the compiler and the RTL in Windows without problems.

@jayrm
Copy link
Member

jayrm commented Sep 15, 2019

  1. If you look at the details of the Travis CI build log, can see that:
src/rtlib/file_datetime_w.c: In function ‘fb_FileDateTimeW’:
src/rtlib/file_datetime_w.c:14:6: error: implicit declaration of function ‘wstat’ [-Werror=implicit-function-declaration]
  if( wstat( filename, &buf ) != 0 )
      ^
compilation terminated due to -Wfatal-errors.
  1. The initial try here kind of misses the design of the rtlib for handling file I/O. src/rtlib/file_*.c files are the API to fbc and do not call system functions directly. They call the "device driver" for whatever device is opened. Not every device (e.g. file/printer/serial) will support every kind of file I/O operation. For example the fflush() function is already in dev_file_flush.c. See fb_file.h:FB_FILE_HOOKS. Basically the only thing missing is a way to expose it from fbc user code.

  2. If the implementation is platform specific, i.e. win32 then the preferred location is in ./src/rtlib/win32 directory, and avoid #ifdef conditionals in ./src/rtlib. Not a rule, just a guideline.

  3. you don't need to create a new user for every pull request. Just use different branches. You can clone to multiple directories on your pc and work on different branches all under same user. I see you now have 4 user names. FYI, I just fetch them all in to one repo on my own pc.

  4. in inc/file.bi the declare functions should probably be rewritten as overloaded functions, using fb's overload resolution to select zstring or wstring function call.

  5. Perhaps choose one singular thing to add and make that a pull request, and we can just deal with that one thing completely, to the finish line.

@jklwn4
Copy link
Author

jklwn4 commented Sep 15, 2019

Jeff,

thanks for your help!

  1. i found the log, thanks for your hint
    2/3. i just cloned wide string versions of already existing files, i added a new file for FLUSH and SETEOF, which should work for all targets.
  2. i´m still struggling with GIT, so the easiest solution was to have more than one fork, i will delete these forks and accounts, when i don´t need then anymore.
  3. this was my first approach, but then i realized that i don´t have to change anything in Linux, i only need to change the Windows part for wide strings. Windows internally converts all ...A functions to ...W functions anyway. So why not handle the Windows part for all file/path related statements as wide strings and only as wide strings. STRING/ZSTRING will be converted to WSTRING automatically, so in the RTL, we would need only W-functions in Windows - what do you think?

I think, i will close this PR and start a new one.

JK

@jayrm
Copy link
Member

jayrm commented Sep 15, 2019

Well, you don't have to close this PR. My opinion, is that if you are going to use the PR as a development tool, it would be nice just to keep the whole discussion here. You can force push your changes and basically overwrite what you already have. Even if you close the PR, github seems to keep the history forever anyway, and I will probably reference your new PR to here. Anyway, I'm not sure what your remote name is, but would be something like git push jklwn4 File --force assuming your remote name is jklwn4. Or if you are using a GUI, then there is probably an option for force should be available.

As for the device drivers, as I mentioned, the FLUSH function is already in dev_file_flush.c. So yes, you do need to add a file_flush.c as the API, but it should call the device driver through FB_FILE_HOOKS. You will notice FB_FILE_HOOKS in dev_file_open.c, dev_lpt.c, dev_pipe_open.c, dev_scrn.c, etc, that not every device supports every file I/O operation, and some devices do not have an entry for FLUSH function. The purpose of file_flush.c would be to provide API to fbc and then call the FLUSH function through the device driver, or if it is not available, always return error/success, etc. The reason for this layer of abstraction is that some devices can not use standard C runtime file I/O and individual code for each platform has to be written.

@jklwn4
Copy link
Author

jklwn4 commented Sep 15, 2019

This Linux stuff is driving me crazy, maybe i should restrict my code to Windows and leave the rest to someone with more experience of the other targets.

@jayrm
Copy link
Member

jayrm commented Sep 15, 2019

This Linux stuff is driving me crazy, maybe i should restrict my code to Windows and leave the rest to someone with more experience of the other targets.

Yes, my opinion, the main reason for selecting one single thing to improve on, and then taking that one single thing to the finish line. For example, FLUSH command (which is first commit), added through inc/file.bi should actually be fairly straight forward. Do just that one thing first, get it out of the way, then move on to the next.

@jklwn4
Copy link
Author

jklwn4 commented Sep 16, 2019

Thanks for the guidance, i hope i got it right now.

@jayrm
Copy link
Member

jayrm commented Sep 19, 2019

@jklwn4, @jklwn3
I was just looking over your commit for FLUSH command yesterday (2 days after your force push). It looks pretty good. There's a few things I would change. I guess I was a little slow (it's only been 3 days now), or you were a little impatient, because you are adding more stuff now... :)

The changes are minor, and if it is OK with you, I can make the changes and push to this pull request:

  • file_flush.c:fb_Flush() to be named fb_FileFlush() for consistency with other file functions.
  • inc/file.bi:flush() to be named FileFlush() for consistency with other extend file function.
  • in inc/file.bi , the handle parameters should be renamed filenumber, because filenumber is the correct name and is how it is referred to in documentation. (Your addition is correct, existing declarations are wrong).
  • fb_FileFlush() should have prototype in src/rtlib/fb_file.h
  • in file_flush.c:fb_FileFlush(), don't check handle->type != FB_FILE_TYPE_NORMAL. Let the device driver decide. If the device driver has a pfnFlush() hook, then it's valid. For example, FB_FILE_TYPE_CONSOLE also uses the file device driver. And potentially, could add FileFlush functionality to other device drivers, if it makes sense to do so, like printer or serial.
  • FB_UNLOCK() must be called after call to pfnFlush(), otherwise the lock is lost. Need to use something like res = handle->hooks->pfnFlush( handle ); and return that result.

There's one other change could make, but it's not currently needed. Just thought I would mention in case it is needed in future: many of the file functions have a fb_FileXyz(int fnum) function and a fb_FileXyzEx( FB_FILE *handle ). The first is API for fbc, and second is API for rtlib itself.

Sorry, I haven't looked at the SetEof addition yet (or perhaps FileTruncate()?). Probably will take me a few days before I can review. Please, don't add more stuff until we resolve some issues first. I have commits ready to push to this PR for flush => FileFlush. Do you want me to go ahead and push these commits? Or you want to make changes? Thanks.

@jklwn4
Copy link
Author

jklwn4 commented Sep 19, 2019

Go ahead, make the necessary changes and push it! In general feel free to "fine tune" my PRs whenever and whereever necessary.

Yes, i was impatient. Should i make a separate PR for each change or new feature? This is likely to result in a lot of PRs, but if this is easier/better for you to handle, i will do it.

JK

later: regarding SETEOF vs. FILETRUNCATE, i would prefer SETEOF, because it is shorter and it is as self-explanatory as FILETRUNCATE (EOF = end of file, a user is already familiar with the meaning of EOF).

@jklwn4
Copy link
Author

jklwn4 commented Sep 20, 2019

Reading more about fflush, i realize that this doesnt ensure the file to be written to disk. It flushes internal C buffers to the underlying file system, but it doesn´t flush the file system´s buffers. You must use FlushFileBuffers (handle) in Windows and fsync (fp) in Linux to do that. So this is still missing and should be added!

The more, the return value of fflush should be tested for success or error.

JK

@rversteegen
Copy link
Member

Thanks for working on this, unicode filename support in Windows is sorely missing!

Re: the strategy of just implementing wide versions of certain rtlib functions (eg FILELEN): note that will break support for Windows 95/98/ME, unless people link against unicows.lib and distribute unicows.dll with their programs. Wouldn't it only be a small step to use overloading instead of #ifdefing a different set of functions, as Jeff said, and preserve Win 95/98/ME support? Just earlier this year I ported my FB program to Win 95 and 98 after a user asked for it. But I feel like a jerk for mentioning this, since Win ME was released 19 years ago, and I'm just causing trouble...

There's a quirk when the Windows ANSI codepage is set to Japanese: \ and the yen symbol are the same character! This breaks a lot of file path stuff. So make sure that everything is tested on a computer set to Japanese codepage.

@jklwn4
Copy link
Author

jklwn4 commented Sep 21, 2019

Well, i didn´t expect that Windows 95 is still in use these days. Windows XP - yes, but Windows 95?
The cleaner solution is to handle everything in Unicode in Windows, but this breaks Windows 95 et.al., that´s true.

Jeff, what do you think, should we still support Windows 95? This would indeed require a slightly different approach from what i proposed.

Thanks for the hint (\ vs. yen), but how could we avoid it and how could we test it? Looks like we need a Japanese ;-) ?

JK

@rversteegen
Copy link
Member

rversteegen commented Sep 21, 2019

I'm not actually aware of anything we need to do to avoid breaking because of the Japanese codepage. I think it means converting from ANSI to UTF16 and back again is lossy, and ANSI->UTF16 might break paths? I'd be happy to run the tests under a Japanese codepage on my Win XP VM, since I already installed it.

I think that there are far more Win 98 machines than Win 95.

@jklwn4
Copy link
Author

jklwn4 commented Sep 21, 2019

There is no conversion ANSI -> UTF16 and back again, it´s only ANSI -> UTF16 in Windows. Only the place of the conversion may differ. If we pass an ANSI string to a ...A Windows API function, this ANSI string is converted to UTF16 by Windows internally and the corresponding ...W Windows API function is called. That is, all ...A functions are merely wrappers of the corresponding ...W functions.

So if we call only the ...W functions from FB and convert ANSI to UTF16 in FB, the conversion takes place in FB and no conversion is needed in Windows anymore. The result should be the same and there are no superfluous conversions.

Feel free to test the code of #180 with a Japanes code page and tell me if there are problems or not.

JK

jklwn3 and others added 10 commits September 22, 2019 17:02
- change flush function naming same as other extended file functions
- FileFlush() in inc/file.bi
- fb_FileFlush() in src/rtlib/file_flush.c
- use 'filenumber' instead of 'handle' in inc/file.bi
- add prototype to for fb_FileFlush() to src/rtlib/fb_file.h
- no need to check for FB_FILE_TYPE_NORMAL, if device has pfnFlush function ptr set, then it's OK to call
- besides, OPEN "CON" is FB_FILE_TYPE_CONSOLE, and calling pfnFlush is OK
- FB_UNLOCK() must be after the call to pfnFlush, so fb_FileFlush retains locks until after fflush() call
@jayrm
Copy link
Member

jayrm commented Sep 22, 2019

force pushed for rebase on to current fbc/master

Adds the FileFlush function.

@jayrm
Copy link
Member

jayrm commented Sep 22, 2019

That's interesting. github appears to display the commits by commit author date order rather than the actual commit order. So the commits appear out of order when viewing the pull request on github, but appear in actual application order when viewing the history with git log.

The FileFlush statement is ready to merge in to master. It should generally be allowed on any output/writable stream. FileFlush might make sense on Serial/Printer, but it would have to be a platform specific implementation since each platform handles these devices uniquely.

Next, there is the SetEof => FileTruncate commit to deal with. I think I will just push new commits rather than commenting on all the details, except for the following:

  • unistd.h is already used for a couple other rtlib source files in the top level API source files, but wrapped in a #ifdef HOST_MINGW, even though unistd.h should be available. ftruncate appears to be available. Should be proven with a test.
  • because mixing file stream handles (i.e. fopen) and file descriptor calls (ftruncate), seems prudent to call a fflush before truncating. Have to watch this for behaviour problems.

I think once the truncate function is added, this will be next pull request to merge to fbc/master.

@jklwn
Copy link

jklwn commented Sep 23, 2019

Jeff,

i don´t know, if you noticed my above comment:

Reading more about fflush, i realize that this doesnt ensure the file to be written to disk. It flushes internal C buffers to the underlying file system, but it doesn´t flush the file system´s buffers. You must use FlushFileBuffers (handle) in Windows and fsync (fp) in Linux to do that. So this is still missing and should be added!

- FileFlush has code (for fflush) in the device driver to flush application buffers
- FileFlush if passed non-zero flag, will also call platform specific code to flush system buffer
- FileTruncate removed from file device driver
- FileTruncate added to platform specific code
@jayrm
Copy link
Member

jayrm commented Sep 25, 2019

@rversteegen, thanks, I did notice that. FileTruncate position is based on Seek aka ftello. If ftello maps to the 64-bit version, then so must ftruncate. But have not tested truncate yet with large files on all systems, so will have to watch for that. For example, if ftello maps to 64-bit version, but ftruncate maps to 32-bit, then will get bad results on large files >= 2^32.

@jklwn
Copy link

jklwn commented Sep 25, 2019

Thanks Jeff, i see the new statements follow the naming convention of file.bi. Could we have (additional) short forms of the "FILE..." statements in file.bi too? Obviously FILELEN must remain FILELEN, but FILEFLUSH could be just FLUSH, the same applies to TRUNCATE, EXISTS, etc., you don´t have FILEOPEN, FILEKILL, FILECLOSE, etc., you have OPEN, KILL, CLOSE, etc.

@jayrm
Copy link
Member

jayrm commented Sep 25, 2019

@jklwn, I did notice your comment. Yes, it requires adding platform specific code, which I've done in the last several commits.

My understanding is this:

  • fflush overrides the application's default buffering, and writes out buffers to the system. So, if the application were to suddenly crash, the data will be saved in the system buffers
  • FlushFileBuffers and fsync override the system default buffering and writes out buffers to the device. So, if the system were to suddenly crash, the data will be saved on the device (assuming that the system's device driver actually honors the request, poorly written or non-compliant drivers may ignore the request).

So, I've added the FileFlush feature in this way:
declare function FileFlush alias "fb_FileFlush" ( byval filenumber as long, byval systembuffer as long = 0 ) as long

  • by default, only the application buffers are flushed
  • by passing an extra non-zero value as second parameter (e.g. true), the system buffers will also be flushed.

For the FileTruncate feature, I've removed it from fbc's device driver. Any code we add to the device driver increases the size of the executable, even if not all functions are used. So we try to keep the device driver code small as possible. Instead, because FileTruncate eventually calls platform specific code, we just provide an entry point separate from fb's device driver. However, that means it can probably be called on more types of files than it really should. So, have to watch that.

I have not added the explicit call to fb_FileFlushEx() from fb_FileTruncateEx(), but that would be one way to check that FileTruncate is valid. If FileFlush is valid for the device (part of the device driver), then so should FileTruncate, otherwise return an error.

Because we are adding platform specific code, have to watch for breaking builds for platforms not yet tested. Hoping that the generic unix implementation will work on all platforms other than linux, dos, win, which were tested.

So, looks like a few more tests to do, and another self review of the code before merging in.

@jklwn
Copy link

jklwn commented Sep 25, 2019

For example, if ftello maps to 64-bit version, but ftruncate maps to 32-bit, then will get bad results on large files >= 2^32.

I don´t see, how this could happen, when ftello is followed immediately by ftruncate (just as we coded).

@jklwn
Copy link

jklwn commented Sep 25, 2019

by default, only the application buffers are flushed

i disagree in this point. I would let it default the other way round. When using FILEFLUSH i must have a special reason for doing so, because normally i would let the the system take care for saving my data to a device. So when using FILEFLUSH, i would expect the data to be written to the (physical) device immediately by default.

The additional parameter could be used to only flush the application buffers. Maybe in order to speed things up, because FlushFileBuffers and fsync block the application until the data is actually written to the device.

@jayrm
Copy link
Member

jayrm commented Sep 26, 2019

@jklwn,

Thanks Jeff, i see the new statements follow the naming convention of file.bi. Could we have (additional) short forms of the "FILE..." statements in file.bi too?

I can see that. But not in this pull request. It would be unfair to users to force a whole bunch of new common / short names in to the global name space. In fact, there are users that would like to see a method of having all fbc keywords in a namespace. For quirk keywords, this is challenging as it would require a pseudo namespace created by the compiler, but it's not impossible.

Perhaps a combination of namespace fb and extern "rtlib" will be the solution here, effectively designing & providing an alternate interface to fb's run time library.

@jayrm
Copy link
Member

jayrm commented Sep 26, 2019

@jklwn,

For example, if ftello maps to 64-bit version, but ftruncate maps to 32-bit, then will get bad results on large files >= 2^32.

I don´t see, how this could happen, when ftello is followed immediately by ftruncate (just as we coded).

You have to look deeper to see the issue. And, actually I should have said 2^31-1 since fb_off_t aka off_t, if it is 32-bit will be signed.

ftello and fseeko will map to a 32-bit or a 64-bit version. Same for ftruncate which might map to a 32-bit version or 64-bit version ftruncate64. I think I have this fixed in the next set of commits coming up.

If ftello maps to the 64-bit version, but ftruncate maps to the 32-bit version, then any files >= 2^31 will potentially truncate to the wrong size, or simply fail, if the length passed (64-bit signed cast to 32-bit signed) is negative.

Actually, effective if ftruncate limit is 2^31-2 for some platforms, like DOS, since fb's seek statement is one based. It's not possible to seek to position 2^31 on DOS, which currently is strictly 32-bit based, though newer versions of DJGPP have ftello64 and fseek64 variants.

I didn't include the following test in the test-suite as it generates a ~2GiB file. Since it's possible to use FileTruncate aka ftruncate to both reduce and increase the size of a file, it makes it convenient to test right around the around the 2^31 file size,

#include once "file.bi"

sub Test( byval size as longint )
	print "Testing size = &h" & hex(size)
	const filename = "temp.dat"
	dim idx as longint = size+1
	open filename for binary as #1
	print "  idx = &h" & hex(idx)
	seek #1, idx
	print "  pos = &h" & hex( seek(1) )
	if( FileTruncate( 1 ) <> 0 ) then
	print "  FileTruncate() failed"
	end if 
	print "  lof = &h" & hex(lof(1))
	if( size = lof(1) ) then
		print "  PASSED"
	else
		print "  FAILED"
	end if
	close #1
	kill filename
	print
end sub

Test( 0 )
Test( 2^31-2 )
Test( 2^31-1 )
Test( 2^31 )
Test( 2^31+1 )

Running this test on commit b67995c results in failures on mingw-w64/mingw32, mingw-w64/mingw64, and mingw (org).

I fixed mingw-w64 by using correct ftruncate / ftruncate64 but, and found that it would still fail on mingw (org) because mingw (org) does not have a ftruncate64 function. So instead I had to call windows function SetEndOfFile() API function under mingw (org). I would prefer to use general mapping of ftruncate, but this is what we have for now, and maybe it can get cleaned up in future.

On DOS, the test fails, but for a different reason, file positioning and ftruncate are 32-bit only.

On Linux based tests, it generally passes, because the mapping of ftruncate is handled more consistently by the host system.

Anyway, I think I have a reasonable fix coming up in the next set of commits.

@jayrm
Copy link
Member

jayrm commented Sep 26, 2019

@jklwn

by default, only the application buffers are flushed

i disagree in this point. I would let it default the other way round.

I tried your way, the other way around, and it feels backwards; passing a flag to do something less.

There are a couple of patterns I am following with this design:

SLEEP, SLEEP msec, SLEEP msec, 1 is fairly well known pattern in fbc, where the last form specifying the additional parameter is the one that blocks.

Also like CLOSE, and CLOSE fnum pattern

FileFlush will flush all open file streams
FileFlush fnum will flush specified open file stream
FileFlush fnum, 1 will flush specified open file stream, and system's file buffer to device

@jklwn4
Copy link
Author

jklwn4 commented Sep 26, 2019

I tried your way, the other way around, and it feels backwards; passing a flag to do something less.

Accepted

I don´t share your concern about new names in the global namespace. Adding reasonable and useful new features requires adding new reserved key words, that´s a fact. You don´t ask for too much of FreeBASIC users in this respect, i think.

@jayrm jayrm changed the title File FileFlush and FileTruncate functions Sep 27, 2019
@jayrm jayrm merged commit a22f904 into freebasic:master Sep 27, 2019
@jklwn4
Copy link
Author

jklwn4 commented Sep 27, 2019

One last comment: "FileTruncate" might be missleading, because it can make a file smaller and larger as well. "Truncate" implies making smaller only. I would prefer something like my initial proposal "Seteof" (Set end of file), which implies both.

@jayrm
Copy link
Member

jayrm commented Sep 30, 2019

@jklwn4

One last comment: "FileTruncate" might be missleading, because it can make a file smaller and larger as well. "Truncate" implies making smaller only. I would prefer something like my initial proposal "Seteof" (Set end of file), which implies both.

OK. In pr #181

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants