Skip to content

Python module that converts Overleaf's project histories to Git repositories.

License

Notifications You must be signed in to change notification settings

zambonin/overleaf-to-git

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 

Repository files navigation

A Python module that translates Overleaf's project history into Git
repositories. It uses RoboBrowser to navigate around the site and obtain the
correct diffs.

The module will create a temporary directory with the project file tree and Git
commit history. To convert Overleaf projects into Git repositories, simply use
the command `python -m overleaf_to_git`. The module requires the user to be
logged in a real browser, currently Firefox but easily configurable to be
another browser supported by `browser_cookie3`. One may choose multiple
projects to be converted. The module will show progress as it downloads
information and creates commits.

The whole ordeal takes some time to finish, since every revision of every file
inside the projects is downloaded from Overleaf's servers. However, the raw
files are stored in the operating system's temporary directory to allow for
resumed execution at no cost to Overleaf's servers if anything goes wrong.

Binary files are obtained via the download of old versions as ZIP files, with
care taken so that the minimal number of ZIP files are downloaded. Furthermore,
files opened on Overleaf may be marked as modified even though there are no
apparent modifications. This is likely the result of a user rapidly adding and
removing content, and will be shown on the commit message.

A tool such as `git-filter-repo` may be used to comb through the newly created
repository and modify it to taste. The aim of this module is to import as much
content as possible, even if it seems redundant.

DISCLAIMER: this project may not adhere to Overleaf's Acceptable Use Policy
under the notion of "scraping". Nonetheless, the author claims that Overleaf
does not provide a "publicly supported interface" to archive your project
history, even to paid users. (This project partially simulates a user clicking
through all revisions in the project's "history" tab, and saving the output
into a HAR file, using the network monitor of a common browser.)

About

Python module that converts Overleaf's project histories to Git repositories.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages