<img src="https://pbs.twimg.com/profile_images/1092394418135539713/eplLRcDN_400x400.jpg" width=80px style="text-align:right"><h1>The Internet of Production Alliance </h1>

## Data collection program for the [OKW, Map of facilities](https://www.internetofproduction.org/open-know-where)


Author: Antonio de Jesus Anaya Hernandez, DevOps eng. for the IoPA.

Author: The Internet of Production Alliance, 2023.

Data was collected by Offene Werkstaetten and its partners. Source URL: https://www.offene-werkstaetten.org/de/werkstatt-suche

The Open Know Where (OKW) Initiative is part of the Internet of Production Alliance and its members.

License: CC BY SA

![CC BY SA](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by-sa.svg)

Description: Python code for downloading, parsing, filtering, and sorting the data, then exporting the raw Offene Werkstaetten data and the processed IoPA data as CSV files.

In [1]:
# Uncomment the next line to install the libraries required to run this script:
# !pip install -r requirements.txt

In [2]:
import requests, json, re, time
import pandas as pd

In [3]:
from datetime import datetime
now = datetime.now()

In [4]:
from bs4 import BeautifulSoup as soup

In [190]:
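def req_data(url, max_retries=5, delay=2):
    # Fetch a URL, retrying a bounded number of times instead of recursing forever.
    for attempt in range(max_retries):
        response = requests.get(url, timeout=30)
        print(response.status_code)

        if response.status_code == 200:
            time.sleep(delay)  # pause between requests to avoid overloading the server
            print(url)
            return response

        print("Error response: check the URL or internet availability; retrying.")
        print(url)
        time.sleep(delay)

    raise RuntimeError("No successful response after {} attempts: {}".format(max_retries, url))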
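
When a server is temporarily unavailable, repeating the request immediately rarely helps. The following is a minimal sketch of bounded retries with exponential backoff, not part of the original notebook. `fetch_with_backoff`, its parameters, and the default values are hypothetical; the `fetch` argument is assumed to be any callable that returns an object with a `status_code` attribute, such as `requests.get`.

```python
import time


def fetch_with_backoff(fetch, url, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns a response with status_code 200.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between failed attempts
    and raises RuntimeError once max_retries attempts have failed.
    """
    for attempt in range(max_retries):
        try:
            response = fetch(url)
            if response.status_code == 200:
                return response
        except OSError:
            # Network errors raised by requests derive from OSError (IOError).
            pass
        if attempt < max_retries - 1:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("Giving up on {} after {} attempts".format(url, max_retries))
```

With `requests` imported, a call could look like `fetch_with_backoff(requests.get, url)`. Passing `sleep` as a parameter keeps the function easy to test without real waiting.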

In [6]:
url = "https://www.offene-werkstaetten.org/widgets/search?colorA=74ac61&colorB=0489B1&customMarkerSrc=https://cdn0.iconfinder.com/data/icons/map-location-solid-style/91/Map_-_Location_Solid_Style_06-48.png&customClusterSrc=https://cdn4.iconfinder.com/data/icons/ionicons/512/icon-ios7-circle-filled-48.png"
data = [x.text for x in soup(req_data(url).text, 'html.parser').find_all('script') if 'vow.Map' in x.text][-1]

200
https://www.offene-werkstaetten.org/widgets/search?colorA=74ac61&colorB=0489B1&customMarkerSrc=https://cdn0.iconfinder.com/data/icons/map-location-solid-style/91/Map_-_Location_Solid_Style_06-48.png&customClusterSrc=https://cdn4.iconfinder.com/data/icons/ionicons/512/icon-ios7-circle-filled-48.png


In [7]:
data_f = '[{"' + re.findall(r'\[{"(.*?)"\}\]\,', data)[0] + '"}]'
data_json = json.loads(data_f)
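
The extraction step above can be tried on a small string without any network access. This is only a sketch: `sample_script` is a made-up stand-in for the text of the page's `<script>` tag, whose real layout is not shown in this notebook.

```python
import json
import re

# Hypothetical stand-in for the script text; the real script is longer.
sample_script = (
    'var map = new vow.Map("map", '
    '[{"name":"Demo Werkstatt","uid":"1","lat":"52.5","lng":"13.4"}], {"zoom":6});'
)

# Same pattern as above: capture everything between the first '[{"' and the
# first '"}],' that follows it (non-greedy).
inner = re.findall(r'\[{"(.*?)"\}\]\,', sample_script)[0]

# findall() returns only the captured group, so the delimiters are restored
# before the string is parsed as JSON.
records = json.loads('[{"' + inner + '"}]')
print(records[0]["name"])
```

Because the pattern ends in `"}],`, it only matches when the last value in the array is a quoted string and the array is followed by a comma.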

In [8]:
input_ = pd.DataFrame(data_json)

In [9]:
input_.reset_index(drop=True, inplace=True)

In [10]:
input_.to_csv('raw_offene_input_' + now.strftime("%Y_%m_%d_%H%M") + '.csv')

In [11]:
input_.columns.tolist()

['name',
 'img',
 'uid',
 'url',
 'lat',
 'lng',
 'street',
 'zip',
 'city',
 'web',
 'country',
 'cats',
 'aai',
 'icm',
 'street_nr']

In [12]:
transform = input_.rename(columns={'uid': 'offene_id', 'lat': 'latitude', 'lng': 'longitude', 'zip':'postal_code'})

In [13]:
transform['offene_url'] = 'https://www.offene-werkstaetten.org/werkstatt/' + transform.url

In [14]:
transform['address'] = transform.street.astype(str) + ', ' + transform.street_nr + ', ' + transform.aai
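
String concatenation of columns propagates missing values, so one empty field can blank out the whole address. Whether the source data contains such gaps is an assumption here. A minimal sketch of a more tolerant join, using a hypothetical helper `join_address`:

```python
def join_address(*parts):
    """Join address parts with ', ', skipping missing or empty values."""
    cleaned = []
    for part in parts:
        # Skip None and NaN (NaN is the only value not equal to itself).
        if part is None or part != part:
            continue
        text = str(part).strip()
        if text:
            cleaned.append(text)
    return ', '.join(cleaned)


print(join_address("Musterstr.", "12", None))
```

It could be applied row by row, for example with `transform.apply(lambda r: join_address(r.street, r.street_nr, r.aai), axis=1)`.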

In [188]:
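def decrypt(js):
    # Decode the e-mail address obfuscated in the page's inline JavaScript.
    if js is None:
        print("offline?")
        return None

    # Fixed-offset slices expected to contain the "var a" (key) and
    # "var c" (cipher text) assignments; backslash escapes are removed.
    a_cut = js[17:104].replace('\\', '')
    c_cut = js[123:181].replace('\\', '')

    a_match = re.search(r'var a="(.*?)";', a_cut)
    c_match = re.search(r'var c="(.*?)";', c_cut)
    if a_match is None or c_match is None:
        return None
    a = a_match.group(1)
    c = c_match.group(1)

    # Substitution cipher: the character at position i of the key "a"
    # stands for the character at position i of the sorted key "b".
    b = ''.join(sorted(a))
    return ''.join(b[a.index(e)] for e in c)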
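
The decoding step in `decrypt` is a substitution: the key `a` is sorted to give `b`, and each character of `c` is replaced by the character of `b` found at the position that character occupies in `a`. The round trip below illustrates this in isolation. It is a sketch with made-up data: the key and address are invented, and `encode` is a hypothetical helper that exists only to build a test input.

```python
def decode(key, cipher):
    """Map each cipher character from its position in key to sorted(key)."""
    alphabet = ''.join(sorted(key))
    return ''.join(alphabet[key.index(ch)] for ch in cipher)


def encode(key, plain):
    """Inverse mapping, used here only to build test data (hypothetical)."""
    alphabet = ''.join(sorted(key))
    return ''.join(key[alphabet.index(ch)] for ch in plain)


# Made-up key: a shuffled string of unique characters, not taken from the site.
key = "gx.rl@ombpae"
plain = "lab@example.org"

cipher = encode(key, plain)
decoded = decode(key, cipher)
print(cipher)
print(decoded)
```

The round trip relies on the key containing each character exactly once; with repeated characters, `key.index` would always return the first position and the mapping would no longer be reversible.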


In [191]:
transform['contact_email'] = transform.offene_url.apply(lambda x: decrypt(soup(req_data(x).content, 'html.parser').find('span', string=re.compile(r'javascript protected email address')).find_next_sibling('script').text))

200
https://www.offene-werkstaetten.org/werkstatt/rosenwerk
200
https://www.offene-werkstaetten.org/werkstatt/35-services-e-v-offene-werkstatt
200
https://www.offene-werkstaetten.org/werkstatt/360-raum-fuer-kreativitaet
200
https://www.offene-werkstaetten.org/werkstatt/3d-druckzentrum-ruhr
200
https://www.offene-werkstaetten.org/werkstatt/3d-repaircafe-de
200
https://www.offene-werkstaetten.org/werkstatt/4830
200
https://www.offene-werkstaetten.org/werkstatt/abantu-kulturlabor
200
https://www.offene-werkstaetten.org/werkstatt/akademie-fuer-suffizienz
200
https://www.offene-werkstaetten.org/werkstatt/aktivitetshuset
200
https://www.offene-werkstaetten.org/werkstatt/allweshape
200
https://www.offene-werkstaetten.org/werkstatt/alte-giesserei-berlin-e-v
200
https://www.offene-werkstaetten.org/werkstatt/offene-werkstatt
200
https://www.offene-werkstaetten.org/werkstatt/asta-fahrradwerkstatt-e-v
200
https://www.offene-werkstaetten.org/werkstatt/atelier-fuer-keramik
200
https://www.offene-wer

In [192]:
output = transform.drop(columns=['img', 'street', 'street_nr', 'aai', 'cats', 'url', 'icm', 'web'])

In [193]:
output.to_csv('iopa_offene_output_' + now.strftime("%Y_%m_%d_%H%M") + '.csv')

In [194]:
print("OKW entries: {r[0]}, columns = {r[1]}".format(r=output.shape))

OKW entries: 482, columns = 10
