Extract GitHub links from text files.
This project helps you extract all GitHub links from a large text file into a separate file. The script filters lines starting with https://github.com/ and outputs them to a new file, making it easy to isolate and manage relevant links.
- Extracts valid GitHub repository links (e.g., https://github.com/user/repo) from a text file.
- Handles large files efficiently.
- Supports customizable input and output filenames.
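The filtering described above can be sketched roughly as follows. This is not the repository's extract_github_links.py, which is not reproduced here; it is a minimal illustration that assumes UTF-8 input and uses a hypothetical function name and signature. Reading one line at a time keeps memory use flat on large files, and the two path parameters stand in for the customizable input and output filenames.

```python
GITHUB_PREFIX = "https://github.com/"


def extract_github_links(input_path="input.txt", output_path="github_links.txt"):
    """Copy every line that starts with the GitHub prefix to output_path.

    Hypothetical helper for illustration; returns the number of links written.
    """
    count = 0
    with open(input_path, "r", encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        # Iterating over the file object reads one line at a time,
        # so the whole file is never held in memory.
        for line in src:
            if line.startswith(GITHUB_PREFIX):
                # Drop trailing whitespace and write exactly one link per line.
                dst.write(line.rstrip() + "\n")
                count += 1
    return count
```

Calling extract_github_links("input.txt", "github_links.txt") would then write the matching lines to github_links.txt; passing other paths changes the input and output files.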
- Python: Make sure Python 3.x is installed on your system.
- Check your Python version:
python --version
- Prepare Your Environment:
  - Create or locate a folder where you'll store the script and files.
  - Ensure the text file you want to process is in the same folder. For example, name it input.txt.
- Create the Python Script:
  - Open a text editor (e.g., Notepad, VS Code, or Sublime Text).
  - Copy and paste the script from the extract_github_links.py file in this repository.
  - Save the file in the folder as extract_github_links.py.
- Run the Script:
  - Open your terminal or command prompt.
  - Navigate to the folder where you saved the script:
    cd path/to/your/folder
  - Run the script using Python:
    python extract_github_links.py
- Check the Output:
  - After running the script, a new file named github_links.txt will be created in the same folder.
  - This file will contain all the extracted GitHub links, one per line.
Example input (input.txt):
Some random text
https://github.com/user/repo1
Another line of text
https://github.com/org/repo2
More random text

Example output (github_links.txt):
https://github.com/user/repo1
https://github.com/org/repo2
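To make the example concrete, here is a small in-memory sketch that applies the same prefix filter to the sample lines. The helper name github_lines is hypothetical, and the code only illustrates the "starts with https://github.com/" rule; it is not the repository's script.

```python
SAMPLE_INPUT = """\
Some random text
https://github.com/user/repo1
Another line of text
https://github.com/org/repo2
More random text
"""


def github_lines(lines, prefix="https://github.com/"):
    """Yield each line that starts with prefix, without trailing whitespace."""
    for line in lines:
        if line.startswith(prefix):
            yield line.rstrip()


links = list(github_lines(SAMPLE_INPUT.splitlines()))
print("\n".join(links))
# prints:
# https://github.com/user/repo1
# https://github.com/org/repo2
```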
- Encoding Issues: If you encounter errors related to character encoding (e.g., UnicodeDecodeError), ensure your input file is encoded in UTF-8. If the issue persists, adjust the script to use a different encoding (e.g., latin-1 or cp1252).
- File Not Found: Ensure input.txt exists in the same directory as the script.
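Both problems can also be handled in code. The sketch below is one possible approach, not part of the repository's script: the helper names pick_encoding and open_input are hypothetical, and the order of candidate encodings is an assumption. It tries UTF-8 first, then cp1252, then latin-1, and raises a clearer error when the input file is missing.

```python
from pathlib import Path

# Candidate encodings, tried in order (an assumed order, adjust as needed).
# latin-1 maps every byte to a character, so it never raises and acts as a
# last resort; it may still produce the wrong characters.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def pick_encoding(path, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate encoding that decodes the whole file."""
    for encoding in candidates:
        try:
            with open(path, "r", encoding=encoding) as handle:
                for _ in handle:  # read line by line to keep memory use low
                    pass
            return encoding
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    raise ValueError(f"none of {candidates} could decode {path}")


def open_input(path="input.txt"):
    """Open the input file for reading, with a readable error if it is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; place it next to the script")
    return open(path, "r", encoding=pick_encoding(path))
```

Note that this reads the file once per attempted encoding before the real pass, which trades some speed on large files for not having to guess the encoding by hand.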
Feel free to suggest improvements or report bugs via issues!