Add the city's flag and coat of arms. #66
Comments
I would like to help with this task as well. I will do something later today. I want to try convincing ChatGPT to help extract the image data from Wikipedia. I made some attempts yesterday with provinces as well, but it kept getting distracted and returning wrong URLs.
@AloisSeckar That would be very cool, because there are around 8000 cities. I have added a couple, but it seems like a long task to do by hand.
So I am trying, but it is not very effective right now. It says it has to fetch each image separately and often asks for permission to proceed. It is working, but it is slow. However, I have learned a few things we might use to create an "import script":
UPDATE: The linked list of flag images for cities in |
I made a first version of a custom web crawler to get the actual Wiki image URLs - https://github.com/AloisSeckar/wiki-image-crawler So far it "only" retrieves the list of image URLs from Wiki category pages (example), but unlike ChatGPT, it does so quickly. I will try to improve it soon, so it will be able to fill the retrieved data directly to |
Simple Python script to list all the images:

# List of flags of municipalities:
# https://commons.wikimedia.org/wiki/Category:SVG_flags_of_municipalities_of_Spain_by_province
# List of coats of arms of municipalities:
# https://commons.wikimedia.org/wiki/Category:SVG_coats_of_arms_of_municipalities_of_Spain_by_province
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# URL of the Wikimedia Commons category
url = "https://commons.wikimedia.org/wiki/Category:SVG_coats_of_arms_of_municipalities_of_La_Rioja_(Spain)"
response = requests.get(url)
response.raise_for_status()  # stop early if the page could not be fetched
soup = BeautifulSoup(response.text, 'html.parser')

with open("output.txt", "w", encoding="utf-8") as file:
    # The script expects each file in the category gallery to be an <li class="gallerybox">
    gallery_boxes = soup.find_all('li', class_='gallerybox')
    for gallery_box in gallery_boxes:
        # The <img> src points at a thumbnail, not at the original file
        relative_image_url = gallery_box.find('img')['src']
        # Drop '/thumb/' and the trailing thumbnail name to get the original file URL
        image_url = urljoin(url, relative_image_url.replace('/thumb/', '/'))
        image_url = os.path.dirname(image_url)
        file_name = gallery_box.find('a', class_='galleryfilename')['title']
        file.write(f"Image URL: {image_url}\n")
        file.write(f"File Name: {file_name}\n")
        file.write("\n")
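The URL rewriting step in the script above (remove `/thumb/`, then cut off the last path segment) can be isolated into a small function that is easy to check without hitting the network. This is only a sketch: `thumb_to_original` and `COMMONS_BASE` are names made up here, and the sample path `a/ab/Example.svg` is illustrative, not a real file from the category.

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit

# Assumed base used to resolve protocol-relative or relative src values
COMMONS_BASE = "https://commons.wikimedia.org/"


def thumb_to_original(thumb_src, base=COMMONS_BASE):
    """Map a thumbnail src to the URL of the full-size file.

    Assumes the thumbnail layout the script relies on:
    .../thumb/<dirs>/<File>/<width>px-<File>.png
    """
    absolute = urljoin(base, thumb_src)
    parts = urlsplit(absolute)
    if "/thumb/" not in parts.path:
        # Not a thumbnail URL; return it unchanged
        return absolute
    # Remove the '/thumb/' segment, then drop the trailing thumbnail name
    path = parts.path.replace("/thumb/", "/", 1)
    path = posixpath.dirname(path)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    src = "//upload.wikimedia.org/wikipedia/commons/thumb/a/ab/Example.svg/120px-Example.svg.png"
    print(thumb_to_original(src))
```

Using `posixpath.dirname` instead of `os.path.dirname` keeps the behaviour identical on every platform, since URL paths always use forward slashes.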
We have to add the city's flag and coat of arms.
If someone wants to help, just go to cities.json and look for cities where those attributes are null.
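To find the cities that still need images, something like the following could help. It is a rough sketch under assumptions: the attribute names `flag` and `coat_of_arms`, the `name` key, and the idea that cities.json holds a flat list of objects are all guesses, so adjust them to the real file.

```python
import json

# Hypothetical attribute names; check cities.json for the real ones
IMAGE_FIELDS = ("flag", "coat_of_arms")


def cities_missing_images(cities, fields=IMAGE_FIELDS):
    """Return (city name, missing fields) for cities with a null or absent image field."""
    missing = []
    for city in cities:
        empty = [field for field in fields if city.get(field) is None]
        if empty:
            missing.append((city.get("name"), empty))
    return missing


# Made-up sample data standing in for the contents of cities.json
sample = json.loads("""
[
  {"name": "City A", "flag": "https://example.org/a-flag.svg", "coat_of_arms": null},
  {"name": "City B", "flag": null, "coat_of_arms": null},
  {"name": "City C", "flag": "https://example.org/c-flag.svg",
   "coat_of_arms": "https://example.org/c-coa.svg"}
]
""")

if __name__ == "__main__":
    for name, empty in cities_missing_images(sample):
        print(f"{name}: missing {', '.join(empty)}")
```

With the real file, the sample could be replaced by `json.load(open("cities.json", encoding="utf-8"))`, assuming the top level is a list.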