A miniature Git command-line interface (CLI) clone written in Python. This project is intended for educational purposes, providing insight into the core principles of Git, including object hashing, repository initialization, file staging, and common version control operations.
- init: Initializes a new, empty Git repository.
- hash-object: Computes the SHA-1 hash of a file's content and, optionally, writes it as a 'blob' object to the Git database.
- cat-file: Displays either the content or type of a Git object, given its corresponding hash.
- ls-tree: Enumerates the contents of a 'tree' object, with an option for recursive listing.
- add: Stages file contents by adding them to the index. It accepts a single file, multiple files, or every file within a specified directory.
- ls-files: Displays the files currently residing within the index.
- write-tree: Generates a 'tree' object from the current index and returns its hash.
- commit: Records the staged changes as a new 'commit' object.
- clone: Clones a remote repository, fetching objects, unpacking the packfile, and checking out the most recent commit.
- push: Pushes local commits to a remote repository by identifying new objects, compiling a packfile, and sending it to the remote server.
- status: Provides a summary of the working directory's status, including staged, modified, and untracked files.
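The 'tree' objects handled by ls-tree and write-tree have a compact binary layout in standard Git: each entry is an ASCII mode, a space, the file name, a NUL byte, and the raw 20-byte SHA-1. The following is a minimal sketch of that layout, assuming this project follows the standard format. The names serialize_tree and parse_tree are hypothetical and are not taken from git.py.

```python
from typing import List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, sha1_hex)


def serialize_tree(entries: List[Entry]) -> bytes:
    """Encode entries as the body of a tree object (object header not included)."""
    body = b""
    # Simplification: entries are sorted by plain name. Real Git compares
    # directory names as if they ended with "/".
    for mode, name, sha in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(sha)
    return body


def parse_tree(body: bytes) -> List[Entry]:
    """Decode the body of a tree object back into (mode, name, sha1_hex) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)           # end of "<mode> <name>"
        mode, name = body[pos:nul].decode().split(" ", 1)
        sha = body[nul + 1:nul + 21].hex()       # 20 raw bytes follow the NUL
        entries.append((mode, name, sha))
        pos = nul + 21
    return entries
```

Because the hash is stored as raw bytes rather than hex text, a tree body cannot be read as plain text, which is why a dedicated listing command is useful.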
- Python version 3.6 or higher.
- Standard Python libraries are used, including zlib, hashlib, argparse, pathlib, time, struct, re, urllib.request, logging, os, configparser, and getpass.
To begin, clone this repository to your local machine:
git clone https://github.com/pypros/mini-git-python.git &&
cd mini-git-python
Run the script directly from the command line:
python3 git.py <command> [options]
python3 git.py init
This action generates a new .git directory within the current folder.
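As a rough illustration of what repository initialization involves, the sketch below creates the minimal layout that standard Git expects. It assumes the default branch is main (the branch mentioned in the push section); the function name init_repo and the exact set of directories are assumptions, not a description of what git.py does.

```python
from pathlib import Path


def init_repo(root: Path) -> Path:
    """Create a minimal .git layout under root and return the .git path."""
    git_dir = root / ".git"
    # Object database plus the two standard reference namespaces.
    for sub in ("objects", "refs/heads", "refs/tags"):
        (git_dir / sub).mkdir(parents=True, exist_ok=True)
    head = git_dir / "HEAD"
    if not head.exists():
        # HEAD is a symbolic reference to the branch that the next commit will create.
        head.write_text("ref: refs/heads/main\n")
    return git_dir
```

Calling the function a second time leaves an existing HEAD untouched, so re-initializing does not reset the current branch.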
- A file, such as README.md, should be created.
- The file is then added to the staging area:
python3 git.py add README.md
- The status of the changes can be verified as follows:
python3 git.py status
- Finally, the changes are committed with an accompanying message:
python3 git.py commit -m "Initial commit"
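For readers curious about what the commit step produces, a commit object in standard Git is plain text: a tree line, optional parent lines, author and committer lines, a blank line, and the message. The sketch below builds such a body under simplifying assumptions: one identity is used for both author and committer, and the timezone is fixed to +0000. The name build_commit is hypothetical.

```python
import time
from typing import Optional


def build_commit(tree_sha: str, message: str, author: str,
                 parent_sha: Optional[str] = None,
                 timestamp: Optional[int] = None) -> bytes:
    """Return the body of a commit object (object header not included)."""
    if timestamp is None:
        timestamp = int(time.time())
    lines = [f"tree {tree_sha}"]
    if parent_sha:
        # The first commit in a repository has no parent line.
        lines.append(f"parent {parent_sha}")
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    lines.append("")          # blank line separates headers from the message
    lines.append(message)
    return ("\n".join(lines) + "\n").encode()
```

The author argument is expected in the form "Name <email>". Hashing the result with a "commit" header yields the commit's identifier.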
One can inspect the objects created by the Git system.
- To hash a file without writing the object:
python3 git.py hash-object your_file.txt
- To inspect a 'commit', 'tree', or 'blob' object:
python3 git.py cat-file -p <object_hash>
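The two commands above rest on one encoding rule from standard Git: an object is stored as the header "<type> <size>", a NUL byte, and the content; the SHA-1 is taken over that whole byte string, and the same bytes are zlib-compressed on disk. A minimal sketch follows, with encode_object and decode_object as hypothetical names rather than functions from git.py.

```python
import hashlib
import zlib
from typing import Tuple


def encode_object(data: bytes, obj_type: str = "blob") -> Tuple[str, bytes]:
    """Return (sha1_hex, zlib_compressed_bytes) for a Git object."""
    store = f"{obj_type} {len(data)}".encode() + b"\x00" + data
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def decode_object(compressed: bytes) -> Tuple[str, bytes]:
    """Inverse of encode_object: return (obj_type, content)."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content
```

Since the header is part of the hashed bytes, a file's Git hash differs from the plain SHA-1 of its content.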
python3 git.py clone https://github.com/some-user/some-repo.git
This command downloads the repository's objects and checks out the most recent commit.
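Fetching over HTTPS in standard Git uses the "smart HTTP" protocol, in which messages are framed as pkt-lines: four hexadecimal digits giving the total length (including those four digits), followed by the payload, with "0000" acting as a flush marker. The sketch below shows that framing, assuming this project talks to the remote in the same way. The names pkt_line and parse_pkt_lines are hypothetical.

```python
from typing import List, Optional

FLUSH_PKT = b"0000"  # marks the end of a message section


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as one pkt-line: 4 hex digits of total length, then the data."""
    return f"{len(payload) + 4:04x}".encode() + payload


def parse_pkt_lines(data: bytes) -> List[Optional[bytes]]:
    """Split a byte stream into payloads; a flush packet is returned as None."""
    out: List[Optional[bytes]] = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4].decode(), 16)
        if length == 0:
            out.append(None)
            pos += 4
        elif length < 4:
            # Lengths 1 to 3 are reserved for special packets; not handled here.
            raise ValueError(f"unsupported pkt-line length: {length}")
        else:
            out.append(data[pos + 4:pos + length])
            pos += length
    return out
```

For example, the service announcement line is framed as `001e# service=git-upload-pack\n`, since the 26-byte payload plus the 4-byte prefix gives 30, or 0x1e.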
python3 git.py push https://github.com/some-user/some-repo.git
or
python3 git.py push
This command pushes the local main branch to the remote repository. Note that the current implementation requires authentication to be handled through environment variables or configuration files.
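One way credentials from the environment could be attached to a push request is sketched below. It assumes the smart HTTP receive-pack endpoint and HTTP Basic authentication. The variable names GIT_USERNAME and GIT_PASSWORD and the function build_receive_pack_request are hypothetical; check git.py for the names it actually reads.

```python
import base64
import os
import urllib.request
from typing import Mapping, Optional


def build_receive_pack_request(repo_url: str, body: bytes,
                               env: Optional[Mapping[str, str]] = None
                               ) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a push payload."""
    env = os.environ if env is None else env
    url = repo_url.rstrip("/") + "/git-receive-pack"
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-git-receive-pack-request")
    user = env.get("GIT_USERNAME")    # hypothetical variable name
    secret = env.get("GIT_PASSWORD")  # hypothetical variable name
    if user and secret:
        token = base64.b64encode(f"{user}:{secret}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request
```

Passing the environment as a parameter keeps the function easy to test without touching real credentials.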
- git.py: The principal script encompassing all Git command implementations.
- .git/: The directory created by the init command, which contains the object database, the index, and references.
- objects/: A directory for storing all compressed Git objects, including blobs, trees, and commits.
- refs/: A directory that stores references to commit hashes, such as branch and tag pointers.
- index: The staging area, represented by a binary file that contains a list of files to be included in the next commit.
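To make the layout above concrete, the sketch below maps an object hash to its location under objects/ and reads a branch pointer from refs/. It assumes the standard Git convention for loose objects, where the first two hex characters of the hash name a subdirectory and the remaining 38 name the file. The names loose_object_path and read_branch are hypothetical.

```python
from pathlib import Path
from typing import Optional


def loose_object_path(git_dir: Path, sha: str) -> Path:
    """Return where a loose object with the given full SHA-1 would be stored."""
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a full SHA-1 hex digest: {sha!r}")
    return git_dir / "objects" / sha[:2] / sha[2:]


def read_branch(git_dir: Path, branch: str = "main") -> Optional[str]:
    """Return the commit hash a branch points to, or None if it does not exist."""
    ref = git_dir / "refs" / "heads" / branch
    return ref.read_text().strip() if ref.is_file() else None
```

Splitting objects across subdirectories by hash prefix keeps any single directory from growing too large.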
This project is open-source and is made available under the MIT License.
To enhance the project's maintainability and robustness, the following steps are recommended:
- Refactoring: The codebase should be refactored into smaller, more modular functions. This will improve code readability, reduce complexity, and facilitate easier debugging. The separation of concerns will also make individual components more reusable.
- Testing: Comprehensive unit tests should be developed for all core functionalities. This will ensure that changes to one part of the code do not introduce regressions elsewhere and will confirm that each component behaves as expected under various conditions. A robust test suite is essential for long-term stability and continued development.