
Introduction to GitHub

Overview

The aim for today is to learn the basics of GitHub so that you can use it for your own projects.

Background

  • Git was originally developed to manage large-scale software projects, e.g. the Linux kernel. GitHub, the online platform built around Git, was acquired by Microsoft in 2018, and Microsoft is one of its major users.

  • Although designed at first for software management, it is now used for many other purposes and disciplines, and is widely used in academia, industry and government.

  • It records the history of a project and makes it accessible: you can keep track of versions during project development, e.g. check the status of the project 10 days ago.

  • It is findable (repositories available online through www.github.com and with embedded search capabilities), accessible (via any internet browser) and interoperable (easy interaction with any operating system - Mac, Linux, Windows).

What is version control? What is Git? What is GitHub?

Version control is the management of changes (a.k.a. revisions) to any type of information.

  • It lets you effectively "save" your work at important points in time and come back to any of the saved points. If you make a mistake or lose information, you can recover an earlier version, and a copy on a remote server provides an offsite backup
  • In its simplest form, it means creating copies and changing file names, e.g. adding v1.0, v1.1, v2.0
  • It makes collaboration much easier:
    • Using tools that (to some extent) incorporate version control functionality, e.g. Google Drive and Dropbox
    • Using dedicated version control tools, e.g. Git and GitHub

(http://phdcomics.com/comics/archive.php?comicid=1323)
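The "simplest form" of version control described above can be sketched in a few lines of Python. This is only an illustration: the file name `report.txt`, the helper `save_version` and the version labels are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def save_version(path: Path, version: str) -> Path:
    """Copy `path` to a sibling file tagged with `version`, e.g. report_v1.0.txt."""
    versioned = path.with_name(f"{path.stem}_v{version}{path.suffix}")
    shutil.copy2(path, versioned)
    return versioned


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.txt"  # hypothetical working file

    report.write_text("First draft\n")
    v1 = save_version(report, "1.0")

    report.write_text("First draft\nSecond paragraph\n")
    v2 = save_version(report, "1.1")

    # Going back to a saved point means copying it over the working file.
    shutil.copy2(v1, report)
    restored = report.read_text()
    saved_names = sorted(p.name for p in Path(tmp).iterdir())
```

This works, but every saved point is a full copy with a hand-picked name and no record of who changed what or why. Dedicated tools such as Git keep that record for you.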

The first version control systems were created by groups writing software and code. Fortunately, they can now be used not only by computer scientists (for developing computer code) but by anyone (for any type of file) 😄

There are two types of version control systems:

(adapted from http://lhzuigao.com/309note.html)

Advantages of distributed (right) over centralised (left) version control systems include:

  • If the central repository (server) crashes, it can be recovered from any of the local repositories, e.g. those of the researcher, a collaborator or the group leader.
  • Each person can make changes to their local repository offline, then integrate those changes into the central repository (server) once back online.
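The two advantages above can be illustrated with a toy model. This is a sketch for intuition only and not how Git is implemented: the `Repository` class, its methods and the commit messages are all invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    """A toy repository: a name plus an ordered list of commit messages."""

    name: str
    history: list[str] = field(default_factory=list)

    def clone(self, name: str) -> Repository:
        # A clone carries the full history, not only the latest version.
        return Repository(name, list(self.history))

    def commit(self, message: str) -> None:
        # Commits are recorded locally; no connection to the server is needed.
        self.history.append(message)

    def push(self, remote: Repository) -> None:
        # Send the commits the remote is missing. This toy version only
        # handles the case where the remote history is a prefix of ours.
        shared = len(remote.history)
        if remote.history != self.history[:shared]:
            raise ValueError("histories have diverged; a merge would be needed")
        remote.history.extend(self.history[shared:])


server = Repository("central", ["initial commit"])
researcher = server.clone("researcher")

researcher.commit("add analysis script")  # done offline
researcher.push(server)                   # integrated once back online

# If the server is lost, any local repository can be used to rebuild it.
recovered = researcher.clone("central")
```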

Git is a distributed version control system that keeps track of, and lets you compare, the history of changes made to your scripts and files. It allows groups of people to work on the same documents at the same time without stepping on each other's toes. It was created by Linus Torvalds in 2005 for the development of the Linux kernel. It is free and open source and helps you with:

  • Creating repositories to host your projects using the command-line
  • Tracking changes to the files and folders within your repositories
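As a rough sketch of those two tasks, the Python script below drives the Git command line to create a repository, track one file and show the resulting history. It assumes that `git` is installed and on your PATH (it returns `None` otherwise). The file name, commit message and user identity are placeholders.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity passed on the command line, so the sketch does not
# depend on (or modify) your own Git configuration.
GIT = [
    "git",
    "-c", "user.name=Example User",
    "-c", "user.email=user@example.com",
    "-c", "commit.gpgsign=false",
]


def run_git(args, cwd):
    """Run one git command inside `cwd` and return what it printed."""
    result = subprocess.run(
        [*GIT, *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def first_commit_log():
    if shutil.which("git") is None:
        return None  # git is not available on this machine
    with tempfile.TemporaryDirectory() as tmp:
        run_git(["init"], tmp)                          # create a repository
        (Path(tmp) / "analysis.txt").write_text("first draft\n")
        run_git(["add", "analysis.txt"], tmp)           # start tracking the file
        run_git(["commit", "-m", "My first update"], tmp)
        return run_git(["log", "--oneline"], tmp)       # one line per commit


log_output = first_commit_log()
```

When Git is available, `log_output` holds a single line made of an abbreviated commit identifier followed by the commit message.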

GitHub is an online platform to share and showcase your work with collaborators and a wider audience. It is a tool that helps you build projects that are collaborative, well documented and version-controlled. It provides you with:

  • A place to host and backup your repositories online
  • A nice web interface to your repositories
  • A strategy to collaborate with colleagues

Versions in Git and GitHub are identified by a revision identifier (a hash, usually shown in abbreviated form, e.g. 60363b1), and each such revision is known as a commit. Each revision is associated with a timestamp and the person making the change. Revisions can be compared, restored and, for some types of files, merged.
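To give a feel for where identifiers like 60363b1 come from, here is a minimal sketch of how Git derives an object identifier, assuming the default SHA-1 object format: it hashes a short header followed by the object's content. The helper name `git_object_id` and the author, email, timestamp and message in the commit text are made up for illustration.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash `body` the way Git does: '<kind> <size>' + NUL byte + content."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# A file's content is stored as a "blob"; this is the id of an empty file.
empty_blob = git_object_id("blob", b"")
# → e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# A commit is a small text record that includes the author and a timestamp,
# so changing either of them produces a different identifier.
COMMIT_TEMPLATE = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author Example User <user@example.com> {when} +0000\n"
    "committer Example User <user@example.com> {when} +0000\n"
    "\n"
    "My first update\n"
)
commit_a = git_object_id("commit", COMMIT_TEMPLATE.format(when=1500000000).encode())
commit_b = git_object_id("commit", COMMIT_TEMPLATE.format(when=1500000001).encode())

short_id = commit_a[:7]  # the abbreviated form usually shown on GitHub
```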

There are other version control tools similar to Git, e.g. Subversion (svn). Similarly, there are other online platforms similar to GitHub for sharing and collaborating on code, e.g. GitLab.

How can you use GitHub? How can it be useful for your work?

There are two interfaces to GitHub:

  • The online interface (www.github.com)
  • GitHub Desktop (available for Mac and Windows)

(https://programminghistorian.org/lessons/getting-started-with-github-desktop)

Today, we will be using GitHub's online interface.

Examples

In the context of our research group

  • Communication is key as most projects have both experimental and computational leaders

  • Building from the classical ways of sharing - conversations/meetings, email, Dropbox, shared folders ... we want to build an environment where:

    • Computational colleagues can share code, figures and tables, review each other's work and get credit for their collaborative work
    • Experimental colleagues can follow computational developments, access results and learn methods of data analysis
  • And, ideally, avoid situations like ...

(http://phdcomics.com/comics.php?f=1689)

  • Perhaps a happier lifetime for a research project:

(https://github.com/semacu/20170703_GitHubintheLab_CRUK-CI)

Public (free) and private repositories

If you want to start creating repositories in GitHub, you first need to open an account:

  • Public repositories are free, and can be browsed and downloaded by anyone
  • Private repositories have associated costs depending on the number of collaborators - see pricing of plans. At the time of writing, the individual Pro and Team plans cost $7/month and $9/month respectively, but they are free if you are a student or belong to an academic institution.

Alternatively, GitLab follows a different business strategy and offers free private repositories. There are also other alternatives, e.g. Bitbucket.

Markdown

  • GitHub uses Markdown for text editing: a lightweight language with a plain-text formatting syntax (bold, italics, checkboxes, lists, etc.) that is rendered as web pages online (like HTML but easier). You can use this syntax in text files (file extension: .md), commit messages, issues, blog posts, and more.

  • Markdown is important because GitHub automatically renders it in many places, such as specific files (e.g. README) or your comments and issues.

  • Some examples of Markdown syntax are available here.
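As a small taste of the syntax, the sketch below uses Python to write a README.md containing a few common Markdown constructs. The repository name and the text are placeholders, and the checkbox lines rely on GitHub's own flavour of Markdown.

```python
import tempfile
from pathlib import Path

readme_text = """\
# my_first_repository

This is a **test** repository with *some* formatting.

## To do

- [x] Create the repository
- [ ] Add an analysis script

1. First step
2. Second step

See [GitHub](https://github.com) for more.
"""
# "#" and "##" start headings, "**...**" is bold, "*...*" is italics,
# "- [x]" and "- [ ]" are ticked and unticked checkboxes,
# "1." starts a numbered list and "[text](url)" is a link.

with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text(readme_text)
    headings = [
        line for line in readme.read_text().splitlines() if line.startswith("#")
    ]
```

Pasting the same text into a README.md on GitHub shows the rendered version: headings, bold and italic text, a checklist, a numbered list and a link.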

Practical session: working with GitHub

We have several tutorials:

Create a GitHub account

If you don't have a GitHub account already:

  • Go to https://github.com
  • Fill in your Username, Email and Password. Then click on the green button "Sign up for GitHub".

  • On the "Choose your personal plan" page, select "Free plan" and then click on "Continue".

  • On the "Tailor your experience" page, tick the boxes that apply to you and click on "Submit". Otherwise, just click on "skip this step".

  • You have created a GitHub account! 😄

Create your first repository

  • If you are not already signed in, sign in to GitHub using the Username/Email and Password created before.
  • Click on the top-right "avatar icon" and select "Your profile". Have a quick browse through your page.

  • Click on the top-right "+" icon and select "New repository". You will first need to verify your email address: you should have just received an email from GitHub at the address you provided. Find this email and click on "Verify email address".

  • On the "Create a new repository" page, fill in a "Repository name", e.g. "my_first_repository" or "my_analysis_script". Write a short description of your repository, e.g. "This is a test repository". For now choose "Public" and tick the box to initialize this repository with a README. Finally, click on "Create repository".

  • You created your first repository! 🚀

Explore your GitHub account and first repository

Your GitHub account

  • Click on your top-right "avatar" icon and select "Settings".

  • Explore the tabs "Profile", "Account" and "Emails".

Your first repository

  • Click on README.md and then on the pencil icon on the right ("Edit this file"). Type anything to change the file, e.g. "GitHub is fun!".

  • Scroll down. Enter a commit message, e.g. "My first update", and select the radio button "Commit directly to the master branch". Then click on "Commit changes". Voilà!

  • To view your history of commits for README.md, click on README.md and then on the "History" button on the right.
  • Alternatively, to view your history of commits for your first repository, click on the name of your repository and select the tab depicting a small clock and the number of commits next to it.
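For the curious, the same commit history can also be read programmatically through GitHub's REST API. The sketch below only builds the request URL when run; `fetch_commit_messages` needs an internet connection and a public repository, so it is defined but not called here. The owner name "your-username" is a placeholder.

```python
import json
import urllib.parse
import urllib.request


def commits_url(owner, repo, path=None):
    """URL listing the commits of a repository, optionally for one file."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits"
    if path is not None:
        url += "?" + urllib.parse.urlencode({"path": path})
    return url


def fetch_commit_messages(owner, repo, path=None):
    """Return (short id, message) pairs, most recent commit first."""
    request = urllib.request.Request(
        commits_url(owner, repo, path),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        commits = json.load(response)
    return [(c["sha"][:7], c["commit"]["message"]) for c in commits]


# History of README.md in the repository created earlier (placeholder owner).
example_url = commits_url("your-username", "my_first_repository", "README.md")
```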

Bonus points:

  • Try to create a second new file and add some content to it
  • In your new repository, have a look at the "Settings" tab, explore "Collaborators" and try adding one of your colleagues

Key glossary:

  • Repository: it can be thought of as a project folder. A repository contains all of the project files, issues, wikis and more. It also stores the history and versions of each file.

  • Commit: equivalent to saving your changes to a file. When you commit you usually include a brief description of the changes you made so you can identify versions later if you want to undo a change.

  • Branch: an identical copy of a project at a particular point in time kept separate from the 'master' branch (primary copy). This keeps your code in the 'master' branch safe while you make changes and experiment with code on the new branch. You can merge your new branch back into the 'master' branch when you want to publish your changes.

  • Master: the default branch in your repository.

  • Collaborator: someone with read and write privileges to a repository as approved by the repository owner.
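To see the glossary terms work together, here is a sketch that uses the Git command line from Python to create a repository, make a commit, experiment on a branch and merge it back. It assumes `git` is installed (it returns `None` otherwise). The file name, the branch name "experiment" and the user identity are placeholders.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity, so the sketch does not depend on your Git settings.
GIT = [
    "git",
    "-c", "user.name=Example User",
    "-c", "user.email=user@example.com",
    "-c", "commit.gpgsign=false",
]


def run_git(args, cwd):
    result = subprocess.run(
        [*GIT, *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def branch_and_merge():
    if shutil.which("git") is None:
        return None  # git is not available on this machine
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "notes.txt"
        run_git(["init"], tmp)  # repository
        # The default branch is called "master" or "main" depending on setup.
        primary = run_git(["symbolic-ref", "--short", "HEAD"], tmp).strip()
        notes.write_text("stable text\n")
        run_git(["add", "notes.txt"], tmp)
        run_git(["commit", "-m", "Initial version"], tmp)  # commit

        run_git(["checkout", "-b", "experiment"], tmp)  # branch
        notes.write_text("stable text\nexperimental idea\n")
        run_git(["commit", "-am", "Try an idea"], tmp)

        run_git(["checkout", primary], tmp)  # back to the default branch
        before_merge = notes.read_text()     # the experiment is not visible here
        run_git(["merge", "experiment"], tmp)
        after_merge = notes.read_text()      # now it is
        return before_merge, after_merge


result = branch_and_merge()
```

The default branch stays untouched while you experiment, and only picks up the change once the branch is merged.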


Create an issue and a branch

(https://buildazure.com/introduction-to-git-version-control-workflow/)

Follow steps 5-8 in the following page


My first pull request

(https://www.dataschool.io/simple-guide-to-forks-in-github-and-git/)

Follow steps 1-7 in the following page. No need to complete the Stretch goal.


Additional tutorials

If you are more interested, try the following later:

The End

Many thanks for your attention! Enjoy Git and GitHub! :octocat:

Any questions/suggestions about this workshop or the materials? Just email me at: sermarcue@gmail.com

