File line-endings? #141

Open
jimlawton opened this issue Sep 30, 2016 · 9 comments

@jimlawton
Collaborator

Just wondering what file line-endings are to be preferred for transcriptions?
Since most of us are Linux/Mac users, I imagine that LF would be preferred?
The Aurora transcription doesn't specify I think.

I think default Git settings will do a reasonable job. However, anyone can override their client settings and check in CRLF, or worse a mix of CRLF and LF.

I just ran yaYUL --format on a module, and it generated a file with CRLF line endings. Is this built into yaYUL?
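
One way to check whether a file is LF, CRLF, or a mix is to count the lines that end in a carriage return. The following is a minimal sketch; `line_endings` is a hypothetical helper name, not part of yaYUL or Git.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file uses LF, CRLF, or a mix of both.
line_endings() {
    cr=$(printf '\r')
    # Total number of lines in the file.
    total=$(grep -c '' "$1" || true)
    # Lines whose last character before the newline is a carriage return.
    crlf=$(grep -c "${cr}\$" "$1" || true)
    if [ "$crlf" -eq 0 ]; then
        echo "LF"
    elif [ "$crlf" -eq "$total" ]; then
        echo "CRLF"
    else
        echo "mixed"
    fi
}
```

Calling `line_endings ERASABLE_ASSIGNMENTS.agc` prints one of the three labels. Classic Mac CR-only files are not handled by this sketch.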

@jimlawton
Collaborator Author

OK, forget about yaYUL --format. I was running it via Docker as follows:

$ docker run --rm -ti jlawton/virtualagc /bin/sh -c 'cd /virtualagc/Aurora12 && ../yaYUL/yaYUL --format ERASABLE_ASSIGNMENTS.agc' > ERASABLE_ASSIGNMENTS.agc

If I check the file type inside the container it is LF. The CRs are getting added somehow by the act of redirecting Docker stdout to a file.

@jimlawton
Collaborator Author

Yep, when I use docker cp to copy the file out, it's fine.
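
A likely explanation, not confirmed in this thread, is the `-t` flag: it allocates a pseudo-TTY, and terminal output processing typically turns each LF into CR-LF before the host redirect sees it. The sketch below runs the same command without `-t`; the function names `format_module` and `strip_cr` are made up for illustration.

```shell
#!/bin/sh
# Sketch: same invocation as above, minus -t/-i, so that no pseudo-TTY
# sits between yaYUL's stdout and the redirect on the host.
# The image name and paths are the ones used earlier in this thread.
format_module() {
    docker run --rm jlawton/virtualagc /bin/sh -c \
        "cd /virtualagc/Aurora12 && ../yaYUL/yaYUL --format $1"
}

# Fallback filter: drop every carriage return from a stream, for cases
# where the TTY cannot be avoided.
strip_cr() {
    tr -d '\r'
}
```

Usage would look like `format_module ERASABLE_ASSIGNMENTS.agc > ERASABLE_ASSIGNMENTS.agc`, or `... | strip_cr > file` when keeping `-t`.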

@rburkey2005
Member

I'm not sure what the to-do is here, other than to answer the question as to what is preferred.

I think the software tries to be agnostic about it, but I was using UNIX style (LF-only) for years before anyone else came on board in this project, so naturally they're everywhere now. As you say, Windows uses CR-LF and is much, much more popular. I don't know what Mac uses now, but they used to use CR-only, which was a real pain to deal with, and probably really does break the software.

Rather than go in and fix some huge number of files at this point, and to maintain consistency, my personal preference would be to continue to go with LF-only, and just hope people are using editors that are smart enough to handle this. On Windows, for example, Wordpad can handle it, but Notepad cannot, in my experience, and I don't know about Word. On the other hand, Wordpad defaults to having unsuitable tab-stops (i.e., not 8 spaces), so I can't claim it's ideal either.

But the plain truth is that except for Outlook (to look at my company email), I only use Windows extremely rarely (to do my taxes, say, or if I have to run Dragon Naturally Speaking), and I run Mac OS only every couple of years. So I have almost no insight into what may or may not be easy for the majority of people to work with.

What would be cool is if we could simply recommend some cross-platform editor (or different editors for different platforms) that properly supported LF, tabs, and AGC syntax-highlighting to people. If people wanted to use that (or them), fine, problem solved, and if not, then they'd be no worse off than before.

@jimlawton
Collaborator Author

Mac OS X is LF now also.

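Depending on the `core.autocrlf` setting, Git can let users work with whatever their platform prefers. With `core.autocrlf` set to `true` (the option the Git for Windows installer selects by default), it converts to LF on commit, storing the LF representation in the repo, and converts back to CRLF on checkout. If the setting is left unset, Git does no conversion at all.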

Maybe we should just put a note in the transcription wiki page to say that Unix line-endings are preferred, and that such-and-such a Git setting should be used if you're using Windows?
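
As a sketch of what such a note might recommend: `core.autocrlf=true` on Windows and `core.autocrlf=input` elsewhere, so the repo always receives LF. The helper name `autocrlf_for_platform` is hypothetical, and the `uname` patterns assume a Git Bash, MSYS, or Cygwin shell on Windows.

```shell
#!/bin/sh
# Hypothetical helper: choose a core.autocrlf value for the current platform.
autocrlf_for_platform() {
    case "$(uname -s)" in
        # Windows shells: CRLF in the working tree, LF in the repository.
        MINGW*|MSYS*|CYGWIN*) echo true ;;
        # Linux/Mac: convert CRLF to LF on commit, never convert on checkout.
        *) echo input ;;
    esac
}

# To apply it (this changes the user's global Git configuration):
#   git config --global core.autocrlf "$(autocrlf_for_platform)"
```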

@rburkey2005
Member

Why don't you write that up, since you obviously know about it? I didn't even know about these Git settings. :-)

@jimlawton
Collaborator Author

OK. Assigning to myself.

@texadactyl

If you are ever confused and need to standardize on 'nix endings, dos2unix will convert DOS/Windows (CR-LF) text endings to Unix/Linux, and its companion mac2unix does the same for old-style Mac (CR-only) endings.
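
A minimal sketch of an in-place conversion, assuming dos2unix may or may not be installed: the hypothetical `to_lf` helper uses it when available and otherwise falls back to `sed`. It handles CR-LF only, not CR-only Mac files.

```shell
#!/bin/sh
# Hypothetical helper: convert one file from CRLF to LF in place.
to_lf() {
    if command -v dos2unix >/dev/null 2>&1; then
        # -q suppresses dos2unix's informational messages.
        dos2unix -q "$1"
    else
        # Portable fallback: strip a carriage return at the end of each line.
        cr=$(printf '\r')
        tmp=$(mktemp) || return 1
        if sed "s/${cr}\$//" "$1" > "$tmp"; then
            # Write back through the original path to keep its permissions.
            cat "$tmp" > "$1"
        fi
        rm -f "$tmp"
    fi
}
```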

@texadactyl

texadactyl commented Jan 29, 2018

If you haven't yet seen any documentation on the .gitattributes file, there is a good write-up here: http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/

I think that this file can be set up to force conversion from CRLF to LF when CRLF is encountered in committed text files. Admittedly, I haven't tried it since I haven't used Windows in many years and I haven't shared a repo with a Windows user yet.

@texadactyl

texadactyl commented Jan 29, 2018

@jimlawton I didn't think that I'd have time to do this now but I got lucky. Here is what I found out: you can have a repo that forces all users to check in and push changes in LF ('nix) line-ending format, provided they do not change or delete the .gitattributes file (created below). This set of steps will also fix all text files currently inside the repo during the git commit operation.

git clone "github project url"
git config --global user.email "you@example.com"
git config --global user.name "Github User Name"
echo "* text=auto" >>.gitattributes
rm .git/index     # Remove the index to force Git to
git reset         # re-scan the working directory
git status        # Show files that will be normalized
git add -u
git add .gitattributes
git commit -m "Introduce end-of-line normalization"
git push

Admittedly, this will inconvenience the Windows contributors of text files but it automates the maintenance of text file line endings into a single format. I hope that helps you save some time.
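
On newer Git versions the index-removal step can probably be replaced by `git add --renormalize .`. The sketch below assumes Git 2.16 or later and an existing clone; `normalize_repo` is a hypothetical wrapper, and the commit and push are left to the caller.

```shell
#!/bin/sh
# Sketch: stage an end-of-line normalization for an existing repository.
# Assumes Git 2.16 or later (for --renormalize).
normalize_repo() {
    (
        cd "$1" || exit 1
        # Let Git detect text files and store them with LF in the repo.
        echo "* text=auto" >> .gitattributes
        git add .gitattributes
        # Re-apply the conversion rules to every tracked file.
        git add --renormalize .
        # Show the files that will be normalized by the next commit.
        git status --short
    )
}
```

After `git commit -m "Introduce end-of-line normalization"`, running `git ls-files --eol` lists the line endings stored in the index (`i/`) and present in the working tree (`w/`) for each file, which is a quick way to confirm the result.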
