```python
# imports for code
import pandas as pd
import networkx as nx
import json
import nx_altair as nxa
import altair as alt
```

# Fifty Years of Technocracy. International Migration Management as an Evolving Discourse Coalition.

- Marijke van Faassen - Huygens ING (marijke.van.faassen@huygens.knaw.nl)

- Rik Hoekstra - Huygens ING (rik.hoekstra@di.huc.knaw.nl)

- Marijn Koolen - Huygens ING (marijn.koolen@di.huc.knaw.nl)

## Introduction

In 1963 the American demographer and sociologist William Petersen wrote a review article on several books on refugee studies. In addition to some studies on the German refugee problem, at that time still an important issue due to the millions of post-World War II ethnic and non-ethnic German expellees from Central and Eastern Europe on German soil, Petersen also reviewed the first issue of a new journal called International Migration: Quarterly review on the role of migratory movements in the contemporary world (IM). The journal was ‘sponsored’ by the so-called Intergovernmental Committee for European Migration (ICEM) and was in fact a merger of two journals: ICEM’s own journal Migration and the bulletin of the so-called Research Group for European Migration Problems (REMP), the REMP-bulletin. Petersen praised the high scientific quality of one of the REMP-monographs he also reviewed, but he was extremely critical of this first journal issue. He not only saw the efforts of the Intergovernmental Committee for European Migration to assist migrants to move from Europe as a mere duplication of the purposes of the UN High Commissioner for Refugees, established in 1952, but also questioned the character of the newly established journal: ‘If the first number of International Migration is a good indication, the shift from the REMP Bulletin has been one from a scholarly journal to a self-aggrandizing house organ. […] Most of the articles are by agency officials rather than scholars.’ (Petersen 1963, 419-421).

The Intergovernmental Committee for European Migration, nowadays better known under its current name International Organisation for Migration (IOM), is one of the bigger players in the field of regulating global migration, albeit not an undisputed one. In particular, IOM’s ‘voluntary’ return programs and IOM’s participation in Australia’s refugee policy in the Pacific are subject to criticism (Andrijasevic and Walters 2010; Fleay and Hoffman 2014). Since the establishment of the Provisional Intergovernmental Committee for the Movement of Migrants from Europe (PICMME), the predecessor of both ICEM and IOM, scholars have been struggling to characterize this organization. It framed itself as a purely technical post-war refugee transport organization, which inherited the fleet and the budget of the temporary UN International Refugee Organization (IRO) after its expiration in 1951. In fact it was an organization with highly political aims, born out of a combination of the post-war ‘one world’ ideology and Cold War reality, which in the first decades of its existence focused especially on stimulating the migration of labour surpluses from Europe (Van Faassen 2014, Ch.3; Parsanoglu 2015).[1] Its twofold objectives and its non-permanent status until the end of the 1980s were probably the reason why it was understudied by both refugee studies and international migration studies for almost forty years. This changed when the scholarly debate on international migration underwent a paradigm shift to the study of migration management and researchers discovered the new global ambitions of ICEM in this field. These ambitions found their expression in a new constitution, in two name changes, first dropping the ‘Europe’ part (ICM 1980) and then shifting from ‘intergovernmental committee’ to the more inclusive ‘international organization’ (IOM 1987), and in the acquisition of permanent status (Perruchoud 1987).
Meanwhile, a lot of critical studies on ICEM/IOM have been published. Although the genesis and early history of the organization in the 1950s are well known by now, most of these studies pay little or no attention to the second decade of its existence, the 1960s. They thus neglect the merger of the journals of ICEM and the hitherto unnoticed REMP research group, as well as the critical notes on ICEM itself and on its self-representation to a wider public that William Petersen put his finger on in his review.

In fact, the changes Petersen noticed coincide with the arrival of a new Director-General of ICEM, the Dutchman Bastiaan Wouter Haveman (1908-1979), who led ICEM from 1962 to 1969 as the only non-American Director-General until 2018. The social-democratic chemical engineer and lawyer Haveman had been appointed Dutch Government Commissioner for Emigration in November 1950, at a time when the Dutch Government based its socioeconomic policy on two cornerstones. In order to prevent structural unemployment, as had occurred during the world economic crisis of the 1930s, large-scale emigration to overseas destinations was to be encouraged, in addition to industrialization (Van Faassen 2014, 2017). As Haveman was of the opinion that migration policy had to be science-based and that planning was only possible if he had an insight into the migration potential of the Dutch population, he established close connections with social scientists and demographers. Haveman himself and his ministerial emigration department closely collaborated with the German political economist and demographer Gunther Beyer, who was praised in Petersen’s review as the one who ‘created and sustained the Research Group for European Migration Problems’ and who ‘achieved an enviable reputation among demographers’ (Petersen 1963, 420). Beyer was also known for his skills in ‘facilitating a dialogue between politicians and demographers’ (Van de Kaa 1983, 3). The research group REMP was established in 1952 in The Hague, with Beyer as secretary/editor (until his death in 1983) and a provisional board largely consisting of Dutch social scientists, who together guided and stimulated the migration research commissioned by the Dutch government.[2] At the same time, Haveman, in close collaboration with the Australian and United States ICEM delegates, at certain moments played a pivotal role in ICEM between 1959 and 1961, which resulted in his appointment on 1 January 1962 as the first non-US ICEM Director (Van Faassen 2017).

            
Given the close professional connections between Haveman and Beyer in the early 1950s and the fact that in the 1960s REMP and ICEM (with Haveman as new Director-General) at least profiled themselves via the same journal, it is highly plausible that, in order to really understand the nature of ICEM’s policies and its technocratic character (Geiger 2018, 39), it is necessary to further study the relation between REMP and ICEM. We argue that especially the constitutive years of ICEM are key to understanding how and which actors from both science and politics were able to influence ICEM’s decision-making processes. By using the concept of discourse coalitions, defined by Wagner (as cited in Raphael 2012) as constellations at any given time in which social scientists develop ideas ‘that strengthen the arguments of a group of actors in the political system, whose policies might, in turn, support the standing of these scientists in academia’, we are able to lay bare networks of technocrats (on the basis of their publications) shaping migration management interactively with national governments. We asked ourselves: can Petersen’s observation, based on the first issue of International Migration, be substantiated through time, and what does it tell us about the character of ICEM? Can we see REMP as the discourse coalition on migration management? If so, how did the discourse coalition evolve into an international version, and can we actually see developments in discourse (a discursive development)? As neither REMP nor ICEM left many open access archival collections, we also had to address some methodological issues: how can we investigate these questions and make our answers plausible? What resources do we need? Is it possible to conduct this type of research based on long-running journals like IM by using the information that is publicly available online, using only their series of metadata and title content? How do we connect analog and digital sources?

In the following sections we first explore the concepts of discourse coalitions and technocracy, and compare the most recent studies on ICEM with previous archival research on REMP, ICEM and their connections with the Dutch emigration governance system. Following this, we present our digital sources, discuss missing links in some of them and present the methodological translation we had to make to answer our main research questions. Finally we analyse and explain our results for ten-year windows, in order to make possible shifts in the discourse networks and in the discourse itself visible, and address the robustness of our choices.

## REMP, ICEM and the Dutch connection

Technocracy tends to present itself as ‘above politics’, but in the last decades scholars have been actively demystifying the technocratic conception of policy analysis as neutral ‘science’, thus bringing to the surface the politics that is inherent in it (Fischer 1995; Nord 2010; Habermas 2015). Technocracy is often defined as controlling or replacing democratic deliberation, which is usually based on finding a compromise between conflicting interests, with a more neutral sounding science-based governance discourse in which political issues are more or less transformed into undisputed, technically defined ends that can be pursued through administrative means. Although Lutz (2012) does not focus on ‘technocracy’ as such, he argues that discourse coalitions originate in the desire to make scientifically informed policy. Therefore, his suggestion to study the impact of ‘expertise’ by a combined analysis of both the networks of actors from different backgrounds in academia, administration and political parties and the networks of discourses they produce is in fact a helpful tool to unravel the technocratic character of certain issues in politics, because it lays bare how the concepts and arguments of the social and human sciences became linked to the political domain and vice versa. Thus, Lutz argues, using the concept of discourse coalitions requires elementary prosopographic evidence (whenever available), but also a keen eye for the actual genesis of possible expert groups. Below, we first discuss the founding years of the Dutch emigration governance system, REMP and especially the predecessor of ICEM, the Provisional Intergovernmental Committee for the Movement of Migrants from Europe (PICMME). By using Lutz’ typology of the configuration of discourses and metaphors used in those periods we study the initial connections between social science expertise and international migration.
This analysis gives rise to conclusions and questions about the relation between REMP and ICEM, and to sub-questions that guide the selection of our sources and the construction of our datasets to answer our main research questions.

The merger in 1961 between the REMP-bulletin and ICEM’s magazine that William Petersen reviewed coincided with the start of the Dutchman Bas Haveman as the new Director-General of ICEM. Haveman was a social-democratic chemical engineer and lawyer and had been appointed Dutch Government Commissioner for Emigration in November 1950. But both Haveman and REMP already had a history that is important for understanding the genesis of REMP. The roots of the Dutch migration policies lie at the end of the 19th and the first half of the 20th century, periods that are characterized by discourses on ‘social reform’ targeted at the ‘working poor’ (1880-1910) and on ‘social engineering’ (1910-1940). According to Lutz (2012), especially after World War I the interventions were grounded in social science research and were based on arguments concerning ‘fear of degeneration’ and, more positively formulated, ‘national recovery’, with discourses around ‘demography’, ‘community’ or ‘eugenics’. The Dutch migration policies fit well into Lutz’ characterization of the impact of expert influence in Western societies in general.

Although today’s polarized debates on economic migration seem to suggest otherwise, for more than a century states have considered (international) migration, and thus its regulation, a sound instrument for the (re)allocation of labour. In 1913, the Netherlands Association for Resettlement (landverhuizing) was founded as a hybrid private-public organization. Amongst its initiators were the liberal Minister of Agriculture Willem Treub and an engineer from Delft Polytechnic University, Isaac Pieter de Vooys. Both were still working on the final report of the State Commission on Unemployment, in which they explored internal and international migration as possible solutions (Van Faassen 2014). De Vooys was also part of an intellectual movement that has been described as synthetic technocracy. This movement was looking for a synthesis that would bridge ideological differences and thus stop the fragmentation of Dutch society, which at that time was divided along religious lines (so-called pillarization), a fragmentation that in their view paralyzed Dutch political decision-making.

During the interbellum their quest became increasingly connected with proposals for educational reform (Baneke 2011, 91), and the movement began to coincide with the educational program of the socially engaged Professor Sebald Steinmetz. He developed a new research style called ‘sociography’ (the predecessor of sociology), focussing on empirical research on Dutch social groups and communities in relation to different people internationally. One of his students initiated the so-called People’s High School Movement in the Netherlands. This movement had aims comparable to those of the technocrats and attracted other Dutch sociographers, like Hofstee and Bouman, who later also played a role in REMP. During the worldwide economic crisis of the 1930s the Friesland High School organized combined employment and emigration courses, thus creating the possibilities for demography-based politics. One of the teachers was the Delft engineer Bas Haveman, who had previously been the secretary of the “Stichting Nederlands Volkskracht” (Dutch People's Power Foundation), which aimed at the moral and physical training of unemployed youth, to prevent ‘moral decay’ and to stimulate ‘community spirit’ (gemeenschapszin), especially to bridge the then perceived gaps in mentality between city and countryside (Van Faassen 2014; Van Faassen and Hoekstra 2017).

After World War II social engineering did not disappear. Emanating from the USA and the UK, it became instrumental for establishing a peaceful and above all a ‘planned modernization’, in which welfare states could be extended, according to Lutz (2012). Metaphors for this modernization discourse were ‘assimilation’ and ‘adaptation’. Again the Netherlands was no exception to the general picture. Already during the war the Dutch government had started to investigate possibilities for a so-called active migration policy, emanating from a Keynesian aspiration for full employment. It based its post-World War II socioeconomic policy on two cornerstones. In order to prevent structural unemployment, industrialization was to be encouraged. In addition, large-scale family emigration to overseas destinations was stimulated and facilitated. This not only enabled the Dutch government to manipulate the supply side of the labour market, but also allowed it to avoid demographic solutions such as birth control, which were still unwanted at that time in the still religious Netherlands. Due to its sky-high birth rates, the country had the highest forecast increase in population until the 1960s (UN Population forecast 1951; Van Faassen 2014; Van Faassen and Hoekstra 2017). Not surprisingly, when Haveman was appointed Government Commissioner for Emigration in 1950 he believed that migration policy had to be science-based and that planning was only possible if he had an insight into the migration potential of the Dutch population.

Therefore he established close connections between his Commissioner’s Office of Emigration, based at the Ministry of Social Affairs, and social scientists and demographers, most of them from his former High School networks. He also established new acquaintances. Amongst them was the German political economist and demographer Gunther Beyer, who fled to the Netherlands in <1935>. Beyer was praised in Petersen’s review as the one who ‘created and sustained the Research Group for European Migration Problems’ (he was editor of the REMP publication series until his death in 1983) and who ‘achieved an enviable reputation among demographers’ (Petersen 1963, 420). According to his biographer, Beyer was also known for his skills in ‘facilitating a dialogue between politicians and demographers’ (Van de Kaa 1983, 3). Haveman met Gunther Beyer when they both were introduced to the American Delegation during its stopover in The Hague, just before the start of the founding Conference of PICMME in Brussels, Belgium, in November 1951, and only a few months before REMP was established in March 1952 (IISG, collection Beyer, inv.nr. 33).

In his latest analysis of IOM, Geiger (2018) does not connect the founding of IOM’s predecessor PICMME with the history of the International Labour Organization (ILO), the International Refugee Organization (IRO) and the United Nations High Commissioner for Refugees (UNHCR), although this connection was part of older ILO studies (Alcock 1971). This interconnectedness is important for recognizing some of the technocratic arguments that accompanied the establishment of PICMME/ICEM, and for better understanding both its long existence outside the UN and the Dutch influence in ICEM. The Netherlands took an active part in the international emigration debate after World War II, in which every country had its own agenda, while humanitarian and economic arguments concerning manpower became inseparably linked to arguments on collective security due to quickly deteriorating East-West relations. Another element of Haveman’s job was strengthening international contacts. He acquired his first international experience as a social-economic expert in the Dutch delegation to the United Nations (UN), as a member of the Special Committee on Refugees and Displaced Persons that set up the International Refugee Organization (IRO) in 1946, a temporary UN agency in which the Eastern (communist) European countries decided not to participate. In this role Haveman worked closely with United States (US) delegate George L. Warren, who was the State Department’s advisor on refugees and later became the head of the American delegation to ICEM until the mid-1960s (Van Faassen 2014, 2017).

In the early 1950s, it was unclear which international institutions would regulate and be able to finance refugee and migrant flows. By the time the mandate of the IRO had expired at the end of the 1940s the refugee problem had still not been resolved. The UN subsequently decided to establish the Office of the UN High Commissioner for Refugees (UNHCR), but with only a small administrative budget and at first without an operational mandate (Salomon 1991, Zieck 1997). Due to Cold War considerations and the ‘white’ migration policies of Canada and Australia, the Western countries were also looking for a continuation of the IRO services outside of the UN system with its communist and ‘non-white’ member states. The International Labour Organization tried to enhance its position by arguing that migration traditionally was a ‘labour’ issue and therefore should fall under the ILO. In 1951, Haveman convinced the Dutch government that the ILO should not emerge as winner of the battle over the IRO’s material legacy (the fleet, personnel and an administrative budget augmented by a US bonus of ten million dollars), even though the then social-democratic Dutch government initially favoured the ILO option because of the tripartite composition of its governing and plenary bodies, with union and employers’ delegates. Haveman informed the Dutch government that during his travels to Canada and Australia he had experienced resistance to ILO interference, due to the non-Western input that was difficult to combine with the de facto white migration policy of both countries. Haveman had also been in touch with Warren and learned that the US Congress would not finance any operational ILO migration work due to strong anti-ILO forces in US domestic politics. As a result the Netherlands supported the solution of allocating the IRO legacy for one year to PICMME, which was instigated and presented by US Congressman Francis E. Walter at the conference in Brussels in November 1951.

PICMME presented itself as purely a transport organization for refugees and displaced persons, temporarily offering technical solutions for what was publicly stressed as a humanitarian problem. However, internally the member states had agreed that its main function was to encourage and facilitate the economic migration of labour surplus from a disrupted Europe, giving  this ‘depoliticized’ agreement in fact a highly political background. The preference for bilateral contacts on migration remained, while at the same time ICEM facilitated multilateral discussions without asking for too much collective commitment from the member states because of its temporary status outside the UN-system. Therefore, in the long run it could even function as a safety valve for an integrating Europe: the moment the free movement of people within Europe actually would be regulated  - in which Germany and the Netherlands would likely become the ‘receiving’ countries - this inflow could be compensated by facilitating the overseas emigration of their own nationals using PICMME / ICEM facilities  (Van Faassen 2014, Parsanoglu 2015). One of the advantages of this ‘light’ construction was that PICMME’s and from 1953 onwards  ICEM’s Executive Board, on which the Netherlands had an almost permanent seat until the 1970s, was only accountable to its own Council and thus could stay under the radar of the national parliaments. 

On the day in November 1951 that Gunther Beyer met Haveman at the US embassy in The Hague, Beyer was invited to inform Congressman Walter and the other American delegates about the plans he had developed since May 1950 to establish an international research group on European migration and refugee problems. This group, which would eventually become the Research Group for European Migration Problems (REMP) in 1952, initially was a bilateral Dutch-German initiative. Gunther Beyer at that moment still was the only formal Dutch representative, together with the German Bildungsökonom (education economist) Friedrich Edding, who worked at the Institut für Weltwirtschaft in Kiel. The aim of the research group was to find solutions for the fact that, in their analysis, the relations between population density, the age distribution, the labour potential and the national earning capacity had become so stressed in the second half of the 1940s that there was a serious risk of social unrest, which could endanger prosperity and could only partially be solved on a national basis. As these Dutch and German researchers were ‘convinced that the integration of the national economies must be accompanied by free migration of labour’, they decided to join forces in order to investigate possible migration problems within Europe. The first studies were published in the spring of 1951 and offered to the US delegation during the November meeting. Beyer stressed that the idea was to extend this research group ‘by inviting personalities from all nations, who are ready and able, by their scientific qualifications and practical experience, to co-operate on a basis free of national, party political and confessional bias’ (all citations IISG, collection Beyer, inv.nr.33).

In March 1952 the Research Group was formally established in The Hague. Beyer was appointed secretary and editor-in-chief. The Dutch sociographer Pieter Bouman, professor at Groningen University and personal friend of Government Commissioner Haveman, had been appointed to form the first board of directors. The provisional managing committee consisted of the Utrecht professor and sociographer Sjoerd Groenman and the Catholic economist and sociographer professor George Zeegers. In its Mission Statement, REMP highlighted the ‘threat of overpopulation’ for the future prosperity of mankind. Regional unemployment and a falling standard of living could be the results of the disproportional distribution of humanity over the earth. They considered it ‘the imperative duty of scientists and statesmen …to concern themselves with these local disharmonies, by studying them and if possible by indicating solutions to the present difficulties’ (IISG, collection Beyer, inv.nr. 30).

In July 1952 the Research Group organised its first international meeting to discuss its initial plans as expressed in a working paper. As Beyer had been invited by the Americans before the PICMME conference, in return the American Porter Jarrell was invited as an observer on behalf of Pierre Jacobsen, the deputy director of PICMME. He resolutely amended the REMP’s intended research plans, which were primarily focussed on intra-European migration. First he outlined PICMME’s activities: organising migrant transportation via field offices in Germany, Austria, Italy and Trieste and via chartered shipping from the Netherlands to Australia, Brazil, the USA and Venezuela, and participating in bilateral agreements. Then he concluded that, being so close to the day-to-day operation, PICMME realised that it might lose sight of the larger demographic implications of its work and therefore looked forward to the scientific studies of the Research Group ‘as possible guideposts for our program in the future’. However, in his view the Research Group was too much confined to intra-European migration only, probably because of ‘a certain distrust of overseas emigration as a cause for the weakening of the basic demographic structure of Europe.’ Jarrell argued that this was not necessary, as both sending and receiving states were represented in PICMME. Thus PICMME offered the opportunity to counteract this danger by experimenting with the development of certain flexible controls concerning size, direction and composition of the outflow: ‘If the Research Group were to consider overseas migration it could well recommend to the various governments the nature of the “flexible controls” which might prove scientifically desirable’. The day after Jarrell left the conference Beyer sent him a letter to confirm that the REMP members fully shared Jarrell’s view and would also take overseas emigration into account as an object for study and advice (all citations IISG, collection Beyer, inv.nr.31).

It can be concluded not only that the actual founding of REMP and PICMME coincided, but also that, via the Dutch connection, the mutual lobbying and influencing already started during the formative months. The initial discourses concerning labour, population density and demography indicate that the historical actors concerned believed in a makeable welfare society, in which one possible instrument was social engineering via regulating the quantity and type of in- and outflow within the respective national populations of the member states. In the Netherlands this was called ‘guided migration’ (geleide migratie) or ‘planned migration’ (Petersen 1955) and was part of the ‘planned modernization’ discourse, as defined by Lutz. However, with Lutz’ remarks on prosopographic evidence in mind, to really substantiate that this expert group in the long run formed a discourse coalition on ‘migration’, and to be able to discover and explain shifts in the presumed networks of actors and discursive elements, we have to establish who the key scientists involved were. How were they connected to the key political actors and institutions mentioned above? And finally: what were the topics of this discourse and how did they develop? In the next section we reflect on the choices of sources and data and on the methodological choices we had to make to start answering these questions.

## The REMP network

REMP was an informal association of social scientists that was much closer to political policy making (both national and international) than was previously known from the literature. This would make it an informal technocratic association, because of the association and the exchange of knowledge between public administrators and scientists. On the one hand this makes administration (in this case migration management) supposedly more rational, because it is based on research and scientific insights. But at the same time, in this kind of technocratic association

- policy making becomes less transparent and less subject to democratic scrutiny, and
- science may be used as a justification for politically motivated choices; assuming that the research itself still conforms to scientific standards, it is relevant to compare the choice of subjects with the political agenda.

### Research Questions

In terms of network research:

1. Did REMP function as a nexus between public officials and social scientists in the 1950s, and how did this nexus function?
2. ICEM was founded at approximately the same time as REMP, and the two appear to have developed closer ties from 1961, when the _REMP bulletin_ became part of the _International Migration_ journal. Was the REMP science-officials nexus carried over and elaborated in ICEM in the 1960s and beyond?
3. At what time did the connection cease or transform, and was there a relation with the change of ICEM into IOM?

Researching the discourse coalition consists of two parts. We first have to make it probable that there were networks that together can be considered to form a discourse coalition because of the involvement of both researchers and administrators. Once the existence of this association has been made likely, we can investigate what research content was produced in the networks. Both REMP and ICEM published journals and other research. To see if and how this content was influenced by the REMP/ICEM discourse coalition, we will compare it with contemporary research about migration from a journal that was not associated with either REMP or ICEM.

#### From Concept to Research Strategy

We are trying to establish the existence of a discourse coalition, that is to say associations between scientists and policy makers, who are usually part of the administration. The separation between scientists and public administration is by no means absolute. Moreover, the discourse coalition evolved over time, both in who was associated with whom and in how they were associated.

A discourse coalition is a concept, an analytical construct that is used to characterize a perceived association between researchers and public administrators based on discursive characteristics. The REMP and ICEM discourse coalition was never a static network; it evolved over time. Because of this, we have to use different data and datasets for evidence. The evidence for the discourse coalition is collected in two different ways:

1. by close reading archival materials in which the association between public administrators and researchers is mentioned, either implicitly or explicitly.
2. by measuring the association between researchers and public administrators in one or more networks. In this case, we are talking about several networks, as migration management was a long-running and increasingly international issue that had to face challenges that changed over the years. The connection between the public administrators and the scientists can be established if they take part in the activities of the same organisation. In this case, the organisations are REMP and ICEM, and the activities we have evidence of are the publication of research in the form of books, articles and other writings.
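The second approach can be illustrated with a minimal sketch. It assumes that each record links a person, with a known role, to an organisation in whose activities that person took part; the names and records below are invented placeholders, not data from this project, and the variable names are hypothetical. The sketch counts shared organisations for every pair of persons and keeps only the ties that connect a researcher to an administrator.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (person, role, organisation).
# These are invented placeholders, not data from the REMP or ICEM sources.
records = [
    ("Person A", "researcher", "REMP"),
    ("Person B", "administrator", "REMP"),
    ("Person B", "administrator", "ICEM"),
    ("Person C", "researcher", "ICEM"),
    ("Person D", "administrator", "ICEM"),
]

roles = {}                  # person -> role
members = defaultdict(set)  # organisation -> persons active in it
for person, role, org in records:
    roles[person] = role
    members[org].add(person)

# For every pair of persons, count the organisations they share.
shared = defaultdict(int)
for org, persons in members.items():
    for u, v in combinations(sorted(persons), 2):
        shared[(u, v)] += 1

# Keep only ties that connect persons with different roles
# (here: a researcher and an administrator).
mixed_ties = sorted(
    (u, v, weight) for (u, v), weight in shared.items() if roles[u] != roles[v]
)
print(mixed_ties)
```

The weighted pairs in `mixed_ties` could then be loaded into a graph, for instance with `nx.Graph().add_weighted_edges_from(mixed_ties)`, for further network analysis.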

The data are localized in time and in scope, as outlined in the diagram below (see figure 1).

<img src="../images/diagram_1.png">

#### Data sets

Our research materials consist of both types of materials. The archival materials that lend themselves to close reading mainly stem from the early 1950s (**XXX make explicit: archive Beyer, REMP, sociale zaken?**). ICEM and its successor IOM are international organizations whose archives are essentially closed to researchers. Even so, the REMP archives make clear that REMP and ICEM both originated in the early 1950s and at their inception involved many of the same people, even though they were separate organizations, but this is not the same as stating that they were part of a joined 

We have different long-running datasets that make it possible to establish the overlap between the different organizations involved. Because we are talking about a research-administrative coalition, the datasets come from research and from the administration, that is, the relevant boards and governments involved with REMP and ICEM. Because researchers publish, scientific output is covered in more fine-grained detail than administrative positions.

Below, we have indicated for the datasets which aspect of the discourse network they contain, including the key persons for the network.

**I** - The Ministry of Social Affairs and mainly the _Rijks Commissaris voor Emigratie_ (RCE, English: _State Commissioner for Emigration_), **Bas Haveman** (but also other politicians and administrators), were very much interested in emigration affairs and in founding policy on scientific data, mainly from sociology, statistics and demography. (*N.B. add references to Van Faassen, Emigratie en Polder*)

**II** - ICEM and IOM directors and deputy directors. **Haveman** was one of the directors (1961-**XXXX**).
Data derive from Van Faassen, _Polder en emigratie_, 126, https://www.iom.int/iom-history (and biographical information over the web?). 

- the ICEM directors were important in international migration management 

**III** - About REMP see above. It was founded in 1952 by **Günter Beijer**. REMP Board membership lists exist from 1952, 1954 and 1969 and a supplement from 1961. The board consisted partly of scientists and partly of administrators and politicians, from the Netherlands and abroad (see spreadsheet sheet REMP personen). This is evidence that these persons were formally associated in the REMP organization in the 1950s until (at least) 1961. A discourse coalition is a loosely connected network that can be established through informal contact in a research group. Data derive from the listings in the REMP publications series (*N.B. add title references*).

- REMP board members were important in migration research, mainly in the Netherlands

**IV** - REMP publication series. 

_Studies over NL emigratie_

* Studies in Social Life (1953-1974, only migration studies)
* REMP Bulletins Supplements (1954-1984)

Evidence for the common research interests and the associations between the public administrators and the scientists is provided by the several series of publications from REMP. Apart from the authors of the studies themselves, these studies involved persons in different roles. There were writers of prefaces and introductions and people who funded research, all of whom made different contributions. 

The prefaces and introductions were often written by administrators or politicians who also financed the publication or the research or both, but did not contribute to the content. This is not to say that the research was not independent in a scientific sense, but that there was a direct interest in the subject and that the study was considered beneficial for the process of policy formation or evaluation. **Günter Beijer**, who founded REMP, was the editor in chief of the series throughout the 1950s, 1960s and 1970s.

- the titles of the collection and the different roles represented in them provide information about the discourse coalition network of REMP from 1953-1984.

**V** - REMP bulletin authors (1952-1962). REMP bulletin was edited by Günter Beijer from 1952. It started out as an informal newsletter but grew into a more serious publication platform for social scientists. 

- the editorship of Beijer established a network tie with the authors of REMP bulletin

**VI** - In 1961 the REMP bulletin merged with the _Migration/Migración_ journal into the ICEM-sponsored _Migration Review_, which is edited by Wiley. See below for further explanations.

- authors and titles of _Migration Review_ provide indications for who was active in studying migration

**VII** - As a comparison we use the _International Migration Review_, which existed from roughly the same time but was not related to ICEM (see below for further explanations).

- _International Migration Review_ is a control dataset for Migration Review

The different datasets all contain information about parts of the technocratic network of scientists and administrators who were involved in what we perceive as an evolving discourse coalition about migration and migration management. Each dataset only has data about part of the network, and the datasets overlap in people but only partially in time. Together the sets allow for an analysis of the evolving network.
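
The partial overlap in time can be made concrete with a small sketch. It uses only the year spans stated in the dataset descriptions above (board lists 1952-1969, publication series 1953-1984, bulletin 1952-1962). The labels, the `spans` dictionary and the `overlap` helper are illustrative names and not part of the analysis code.

```python
from itertools import combinations

# Inclusive year spans as stated in the dataset descriptions above.
spans = {
    "III REMP board": (1952, 1969),
    "IV REMP publication series": (1953, 1984),
    "V REMP bulletin": (1952, 1962),
}

def overlap(a, b):
    """Return the inclusive (start, end) years shared by two spans, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

# Shared years for every pair of datasets.
overlaps = {
    (x, y): overlap(spans[x], spans[y])
    for x, y in combinations(spans, 2)
}
for pair, shared in overlaps.items():
    print(pair, shared)
```

Only the years that two datasets share can be used to compare the people in them directly; outside those years, each dataset documents the network on its own.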

As a consequence, the analysis consists of several parts. 
   - We first establish the existence of a network and a discourse coalition in REMP. This is a network with different roles that we can visualize as a node-edge network showing the types of relations in the network.
   - Then we establish which REMP members were also active in the ICEM network and which other people played an important role there. As the different datasets each contain only a part of the network, they contain information about either the academic or the governance activities. It is not useful to visualize this as a node-edge diagram, as many of the relations are not in the dataset. However, the main issue at stake is who was part of which network. Therefore we visualize the network overlap.
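
In its simplest form, establishing the network overlap amounts to comparing membership sets. The sketch below is an illustration under assumptions: the persons and the `membership` dictionary are invented placeholders, and `overlap_table` and `networks_per_person` are hypothetical helper names that do not occur in the project's scripts.

```python
# Hypothetical membership sets; the real ones would come from the datasets described above.
membership = {
    "REMP board": {"Person A", "Person B", "Person C"},
    "REMP authors": {"Person B", "Person C", "Person D"},
    "ICEM directors": {"Person C", "Person E"},
}

def overlap_table(membership):
    """Count the shared members for every pair of networks."""
    names = list(membership)
    return {
        (a, b): len(membership[a] & membership[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

def networks_per_person(membership):
    """Map each person to the sorted list of networks they appear in."""
    result = {}
    for network, people in membership.items():
        for person in people:
            result.setdefault(person, []).append(network)
    return {person: sorted(networks) for person, networks in result.items()}

table = overlap_table(membership)
# People who appear in more than one network connect those networks.
bridges = {p: n for p, n in networks_per_person(membership).items() if len(n) > 1}
print(table)
print(bridges)
```

Counting shared members answers the question of who was part of which network without requiring the individual relations that a node-edge diagram would need.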

## REMP discourse coalition

The REMP dataset makes it possible to establish the relations between researchers and public administrators in the most detail, as it includes indications of different roles in the publication process. The roles are:

- _article authors_: authors who wrote contributions 
- _preface authors_: authors of the preface to a study
- _introduction authors_: authors of the introduction
- _series editors_: the researchers coordinating the publications
- _executor_: (publishing organization?)
- _funder_: organization that funded the research or publication

The different roles make it possible to establish who contributed to research in different capacities, in this way forging a discourse coalition.
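
How roles can be read from a publication record is sketched below. The column prefixes follow the naming pattern of the relationship records used further on (`article_author1_surname`, `preface_author1_surname`, `editor_surname`). The two example records, the `ROLE_PREFIXES` mapping and the `roles_per_person` function are invented for illustration and are not the project's own extraction code.

```python
from collections import Counter, defaultdict

# Role name -> column prefixes; modelled on the relationship record columns.
ROLE_PREFIXES = {
    "article_author": ["article_author1", "article_author2"],
    "preface_author": ["preface_author1", "preface_author2"],
    "intro_author": ["intro_author1", "intro_author2"],
    "editor": ["editor"],
}

# Invented example records; real records carry many more columns.
records = [
    {"article_author1_surname": "Author X",
     "preface_author1_surname": "Official Y",
     "editor_surname": "Editor Z"},
    {"article_author1_surname": "Official Y",
     "editor_surname": "Editor Z"},
]

def roles_per_person(records):
    """Count how often each person appears in each role."""
    counts = defaultdict(Counter)
    for record in records:
        for role, prefixes in ROLE_PREFIXES.items():
            for prefix in prefixes:
                name = record.get(f"{prefix}_surname", "").strip()
                if name:
                    counts[name][role] += 1
    return counts

counts = roles_per_person(records)
# People who appear in more than one capacity.
multi_role = sorted(name for name, c in counts.items() if len(c) > 1)
print(multi_role)
```

People who turn up in more than one role, for instance as preface author of one study and author of another, are the most direct indication that different capacities were combined in the network.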

In [1]:
## N.B. delete this cell in final version
import sys
sys.path.append('/data/home/jupyter-jdh-artikel/.local/lib/python3.7/site-packages')
%reload_ext autoreload
%autoreload 2

## Data preparation

In [2]:
from scripts.network_analysis import retrieve_spreadsheet_records

def lowercase_headers(records:list):
    return [dict((k.lower(), v) for k,v in record.items()) for record in records] 

entity_records = retrieve_spreadsheet_records(record_type='entities')
entity_records = lowercase_headers(entity_records)
#print('Number of records:' , len(entity_records))


In [3]:
# index each person entity with their category (academic, technocrat)
def get_entity_name(entity: dict):
    # build "surname, infix, initials", leaving out the infix when it is empty
    if entity['prs_infix'] != '':
        return f"{entity['prs_surname'].strip()}, {entity['prs_infix'].strip()}, {entity['prs_initials'].strip()}"
    else:
        return f"{entity['prs_surname'].strip()}, {entity['prs_initials'].strip()}"

entity_roles = {get_entity_name(record): [record['prs_role1'], record['prs_role2'], record['prs_role3']] for record in entity_records}
for k in entity_roles:
    nr = [e for e in entity_roles[k] if e !='']
    entity_roles[k]=nr
#entity_roles
entity_category = {get_entity_name(entity = record): record.get('prs_category') or 'unknown' for record in entity_records}

In [4]:
from scripts.network_analysis import retrieve_spreadsheet_records

relationship_records = retrieve_spreadsheet_records(record_type='relationships')
#len(relationship_records)

In [7]:
pd.DataFrame().from_records(relationship_records).columns

Index(['series', 'volume', 'year', 'article_author1_surname',
       'article_author1_infix', 'article_author1_initials',
       'article_author2_surname', 'article_author2_infix',
       'article_author2_initials', 'preface_author1_surname',
       'preface_author1_infix', 'preface_author1_initials',
       'preface_author2_surname', 'preface_author2_infix',
       'preface_author2_initials', 'intro_author1_surname',
       'intro_author1_infix', 'intro_author1_initials',
       'intro_author2_surname', 'intro_author2_infix',
       'intro_author2_initials', 'executor_org', 'funder', 'client',
       'editor_surname', 'editor_infix', 'editor_initials', 'volume_title'],
      dtype='object')

In [8]:
from itertools import chain


In [9]:
er = list(set(chain.from_iterable(entity_roles.values())))


In [10]:
{'vice-chair_BoD':'REMP director',
 'Dep_dir_ICEM':'ICEM',
 'chair_MC':'REMP board',
 'AB':"REMP board",
 'Correspondent':"correspondent",
 'member _MC':"REMP board",
 'chair_BoD':"REMP director",
 'member_BoD':"REMP director",
 'founder':"founder",
 'editor':"editor",
 'member_MC':"REMP board"}

roled = {'author':['article_author1_surname','article_author2_surname'],
 'preface_author':['preface_author1_surname','preface_author2_surname'],
 'intro_author':['intro_author1_surname','intro_author2_surname'],
 'executor':['executor_org'], 
 'funder':['funder'],
 'client':['client'],
 'editor':['editor_surname'],
 'unknown':['']}

In [11]:
categorized_persons = retrieve_spreadsheet_records("categories")
categorized_persons = lowercase_headers(categorized_persons)

In [61]:
cat_p_df = pd.DataFrame(categorized_persons)

In [13]:
from collections import defaultdict, Counter # XXX imports to top?

from scripts.network_analysis import extract_record_entities

def get_entity_category(entity: dict):
    if entity.get('entity_name') in entity_category:
        return entity_category[entity['entity_name']] 
    else:
        return 'unknown'


In [64]:
cat_p_df['fullname'] = cat_p_df.apply(lambda row: get_entity_name(entity=row), axis=1)
cat_p_df

Unnamed: 0,organisation,period_start,last_known_date,prs_id,prs_surname,prs_infix,prs_initials,prs_function,prs_category,is_academic,is_public_administration,sources,prs_country,prs_role1,prs_role2,prs_role3,remarks,fullname
0,REMP,1952,1983,1,Beijer,,G.,"demographer, The Hague",academic,yes,,,NL,founder,member_MC,secretary-editor,director-editor (1969),"Beijer, G."
1,REMP,1952,1969,2,Groenman,,Sj.,"sociologist, Leiden",academic,1947,1943-1950,https://nl.wikipedia.org/wiki/Sjoerd_Groenman ...,NL,founder,member_MC,vice-chair_BoD,,"Groenman, Sj."
2,REMP,1952,1969,3,Zeegers,,G.H.L.,"economist, sociologist, Nijmegen",academic,yes,1941-1950,https://www.ru.nl/kdc/bladeren/archieven-thema...,NL,founder,member_MC,member_BoD,,"Zeegers, G.H.L."
3,REMP,1952,1969,4,Hofstee,,E.W.,"sociologist, Wageningen",academic,yes,"yes, advisor 5 ministeries",http://resources.huygens.knaw.nl/bwn1880-2000/...,NL,founder,member_BoD,,,"Hofstee, E.W."
4,REMP,1952,1969,5,Bouman,,P.J.,"sociologist, Groningen",academic,yes,,"https://nl.wikipedia.org/wiki/P.J._Bouman, htt...",NL,member_BoD,,chair_BoD (1954),,"Bouman, P.J."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,ICEM,1970,1988,68,Maselli,,G.,deputy director general,,,,,IT,,,,,"Maselli, G."
70,ICEM,1989,1993,69,Charry-Samper,,H.,deputy director general,,,,,CO,,,,,"Charry-Samper, H."
71,ICEM,1994,1999,70,Escaler,,N.L. (Narcisa),deputy director general,,,,,PH,,,,,"Escaler, N.L. (Narcisa)"
72,ICEM,1999,2009,71,Ndioro,,N. (Ndiaye),deputy director general,,,,,SN,,,,,"Ndioro, N. (Ndiaye)"


In [67]:
nms = [get_entity_name(entity=record) for record in entity_records]
remp_entities = cat_p_df.loc[cat_p_df.fullname.isin(nms)]


Index(['organisation', 'period_start', 'last_known_date', 'prs_id',
       'prs_surname', 'prs_infix', 'prs_initials', 'prs_function',
       'prs_category', 'is_academic', 'is_public_administration', 'sources',
       'prs_country', 'prs_role1', 'prs_role2', 'prs_role3', 'remarks',
       'fullname'],
      dtype='object')

In [14]:
record_entities = defaultdict(list)
entity_count = Counter()
entity_role_count = Counter()
for ri, record in enumerate(relationship_records):
    entities = extract_record_entities(record)
    record_entities[ri].append(entities)
    entity_count.update([entity['entity_name'] for entity in entities if 'entity_name' in entity])
    entity_role_count.update([entity['entity_role'] + ' ' + entity['entity_name'] for entity in entities if 'entity_name' in entity])
    for entity in entities:
        if entity['entity_type'] == 'person':
            entity['entity_type'] = get_entity_category(entity)
        # print(entity)
    # print(ri)
    
    

In [15]:
def make_nodes(entity):
    return [n.get('entity_name') for n in entity if n.get('entity_name')]
    
def make_link_from_entity(entity, revnodelist):
    counter = []
    authors = [n.get('entity_name') for n in entity if n.get('entity_role')=='article_author']
    links = []
    for aut in authors:
        autnr = revnodelist[aut]
        counter.append(autnr)
        for node in entity:
            if node.get('entity_role') != 'article_author':
                if node.get('entity_role'): # we don't include titles
                    category = node.get('entity_role') or "unknown"
                    target = revnodelist[node.get('entity_name')]
                    graphnode = rempgraph.nodes()[target]
                    if graphnode.get('category'):
                        graphnode['category'].append(category)
                    else:
                        graphnode['category'] = [category]
                    link = (autnr, target, {"link_type": node.get('entity_role') or 'unknown'})
                    links.append(link)
                    counter.append(target)
    
    return links, counter

In [16]:
nodelist = []
for rentity in record_entities:
    nodelist.extend(make_nodes(record_entities[rentity][0]))
nodelist = list(set(nodelist))
revnodelist = {}
rempgraph = nx.Graph()
for node in enumerate(nodelist):
    rempgraph.add_node(node[0], id=node[0], name=node[1])
    revnodelist[node[1]]=node[0]

In [17]:
linklist = []
counter = Counter()
for rentity in record_entities:
    links, cntr = make_link_from_entity(record_entities[rentity][0], revnodelist)
    linklist.extend(links)
    counter.update(cntr)

In [84]:
communities = {}
for i in cat_p_df.organisation.unique():
     communities[i] = list(cat_p_df.loc[cat_p_df.organisation==i].fullname)
communities

{'REMP': ['Beijer, G.',
  'Groenman, Sj.',
  'Zeegers, G.H.L.',
  'Hofstee, E.W.',
  'Bouman, P.J.',
  'Oldendorff, A.',
  'Gelissen, H.',
  'Schokking, J.J.',
  'Sauvy, A.',
  'Gottmann, J.',
  'Lacroix, M.',
  'Jacobson, P.',
  'Winkler, W.',
  'Janne, H.',
  'Mertens de Wilmars, J.',
  'Baade, F.',
  'Mackenroth, G.',
  'Ritschl, H.',
  'Hoffmann, W.',
  'Neundorffer, L.',
  'Gadolin, de, A.',
  'Vito, F.',
  'Livi, L.',
  'Parenti, G.',
  'Vergottini, M.',
  'Vampa, D',
  'Hyrenius, H.',
  'Salin, E.W.',
  'Rappard, W.E.',
  'Nixon, J.W.',
  'Isaac, J.',
  'Marshall, M.H.',
  'Oudenhove, van, C.M.',
  'Kulischer, E.',
  'Brink, van den, T.',
  'Clark, C.',
  'Hofsten, von, E.A.',
  'Backer, J.',
  'Grot, J.',
  'Goudswaard, G.',
  'Oblath, A.',
  'Fougstedt, G.',
  'Thomas, B.',
  'Edding, F.',
  'Kant, E.',
  'Weinberg, A.A.',
  'Roof, M.',
  'Bastos de Avila, F.',
  'Oudegeest, J.J.',
  'Appleyard, R.T.',
  'Borrie, W.'],
 'Dutch Government': ['Haveman, B.W.',
  'Hofstede, B.P.',

In [85]:
#see https://stackoverflow.com/questions/43541376/how-to-draw-communities-with-networkx (answer by Paul Brodersen)
# N.B. how do we cite this???

import numpy as np
import matplotlib.pyplot as plt
import networkx as nx

def community_layout(g, partition):
    """
    Compute the layout for a modular graph.


    Arguments:
    ----------
    g -- networkx.Graph or networkx.DiGraph instance
        graph to plot

    partition -- dict mapping int node -> int community
        graph partitions


    Returns:
    --------
    pos -- dict mapping int node -> (float x, float y)
        node positions

    """

    pos_communities = _position_communities(g, partition, scale=3.)

    pos_nodes = _position_nodes(g, partition, scale=1.)

    # combine positions
    pos = dict()
    for node in g.nodes():
        pos[node] = pos_communities[node] + pos_nodes[node]

    return pos

def _position_communities(g, partition, **kwargs):

    # create a weighted graph, in which each node corresponds to a community,
    # and each edge weight to the number of edges between communities
    between_community_edges = _find_between_community_edges(g, partition)

    communities = set(partition.values())
    hypergraph = nx.DiGraph()
    hypergraph.add_nodes_from(communities)
    for (ci, cj), edges in between_community_edges.items():
        hypergraph.add_edge(ci, cj, weight=len(edges))

    # find layout for communities
    pos_communities = nx.spring_layout(hypergraph, **kwargs)

    # set node positions to position of community
    pos = dict()
    for node, community in partition.items():
        pos[node] = pos_communities[community]

    return pos

def _find_between_community_edges(g, partition):

    edges = dict()

    for (ni, nj) in g.edges():
        ci = partition[ni]
        cj = partition[nj]

        if ci != cj:
            try:
                edges[(ci, cj)] += [(ni, nj)]
            except KeyError:
                edges[(ci, cj)] = [(ni, nj)]

    return edges

def _position_nodes(g, partition, **kwargs):
    """
    Positions nodes within communities.
    """

    communities = dict()
    for node, community in partition.items():
        try:
            communities[community] += [node]
        except KeyError:
            communities[community] = [node]

    pos = dict()
    for ci, nodes in communities.items():
        subgraph = g.subgraph(nodes)
        pos_subgraph = nx.spring_layout(subgraph, **kwargs)
        pos.update(pos_subgraph)

    return pos

def test(graph):
    #to install networkx 2.0 compatible version of python-louvain use:
     
    from community import community_louvain


    # detect communities in the graph passed in, then lay it out per community
    partition = community_louvain.best_partition(graph)
    pos = community_layout(graph, partition)

    nx.draw(graph, pos, node_color=list(partition.values())); plt.show()
    return



In [86]:
from scripts.network_analysis import generate_graph, add_entities, add_record_links
from community import community_louvain


periods = {
    1950:{'start': 1950, 'end': 1959}, 
    1960:{'start': 1960, 'end': 1969},
    1970:{'start': 1970, 'end': 1985},
     # {'start': 1980, 'end': 1989},
}

grphdict = {}

for period in periods:
    period = periods[period]
    pn = period['start']
    periodgraph = generate_graph()
    for record in relationship_records:
        record['year'] = int(record['year'])
    for ri, record in enumerate(sorted(relationship_records, key = lambda x: x['year'])):
        record['year'] = int(record['year'])
        if record['year'] < period['start'] or record['year'] > period['end']:
            continue
        #print(record['year'])
        entities = extract_record_entities(record)
        for entity in entities:
            if entity['entity_type'] == 'person':
                entity['entity_type'] = get_entity_category(entity)
        named_entities = [entity for entity in entities if 'entity_name' in entity]
        #print([entity['entity_name'] for entity in named_entities])
        add_entities(periodgraph, named_entities)
        add_record_links(periodgraph, named_entities)
        # if ri == 100:
        #     break
    grphdict[pn] = periodgraph


adding link REMP Wander, H.
adding link REMP Bouman, P.J.
adding link REMP Beijer, G.
adding link Wander, H. Bouman, P.J.
adding link Wander, H. Beijer, G.
adding link Bouman, P.J. Beijer, G.
adding link REMP Citroen, H.A.
adding link REMP Groenman, Sj.
adding link REMP Rappard, W.E.
adding link REMP Beijer, G.
adding link Citroen, H.A. Groenman, Sj.
adding link Citroen, H.A. Rappard, W.E.
adding link Citroen, H.A. Beijer, G.
adding link Groenman, Sj. Rappard, W.E.
adding link Groenman, Sj. Beijer, G.
adding link Rappard, W.E. Beijer, G.
adding link REMP Edding, F.
adding link REMP Salin, E.
adding link REMP Beijer, G.
adding link Edding, F. Salin, E.
adding link Edding, F. Beijer, G.
adding link Salin, E. Beijer, G.
adding link REMP Beijer, G.
adding link REMP Oudegeest, J.J.
adding link REMP Sauvy, A.
adding link REMP Beijer, G.
adding link Beijer, G. Oudegeest, J.J.
adding link Beijer, G. Sauvy, A.
adding link Beijer, G. Beijer, G.
adding link Oudegeest, J.J. Sauvy, A.
adding link O

In [87]:
periodcentralities = {}
nodelists = {}
for pn in grphdict:
    periodcentralities[pn] = nx.eigenvector_centrality(grphdict[pn])
    nodelists[pn] = [n for n in grphdict[pn].nodes()]

In [88]:
from itertools import product, combinations, islice
commonnames = {}

#list(combinations(list(nodelists.keys()), r=3))
# list(islice(list(nodelists.keys()), 3))
window_size = 2

seq = list(nodelists.keys())
for i in range(len(seq) - window_size + 1):
    w = seq[i: i + window_size]
#     {e:set(nodelists[e]) for e in }
    for item in w:
        if item not in commonnames.keys():
            commonnames[item] = list(set(nodelists[w[0]]) & set(nodelists[w[1]]))


In [89]:
for i in commonnames:
    commonnames[i].append('Haveman, B.W.')

In [92]:
#this method factors out commonalities for the graphs below
def period_graph(pn, grphdict=grphdict, periodcentralities=periodcentralities, commonnames=commonnames):
    periodgraph = grphdict[pn]

    for f in periodgraph.nodes():
                # comty = periodgraph.nodes[f].get('community')
                # comty = ''.join([c[0] for c in comty])
                if f in commonnames[pn]:
                    label = f
                else:
                    label = ''
                periodgraph.nodes[f].update({#"community" : comty,
                                    # "edgecolor": colors.get(comty) or 'purple',
                                    "centrality" : periodcentralities[pn][f]*4,
                                    "name" : f,
                                    "label" : label
                                   })

    for f in periodgraph.edges():
        for i, c in communities.items():
            # tag the edge only when both endpoints belong to community i
            if f[0] in c and f[1] in c:
                comty = periodgraph.edges[f].get('community') or ''
                comty = ', '.join([comty,i])
                # comty = ''.join([c[0] for c in comty])
                periodgraph.edges[f].update({"community" : comty,
                                    })

    for f in periodgraph.edges():
        comty = periodgraph.edges[f].get('community') or []
        comty = ''.join([c[0] for c in comty])
        periodgraph.edges[f].update({"community" : comty})
    partition = community_louvain.best_partition(periodgraph)
    pos = community_layout(periodgraph, partition)
    
    return periodgraph, pos
    

In [93]:
import altair as alt

chartdict = {}
for period in enumerate(grphdict.keys()):
    n = period[0]
    pn = period[1]
    periodgraph, pos = period_graph(pn)

    chart = nxa.draw_networkx(
            G=periodgraph,
            pos=pos,
            node_size='centrality',
            node_color='entity_type',
            edge_color='community',
            cmap='accent',
            #edge_cmap='category10',
            node_tooltip=['name'],
            node_label='label',
            font_color="black",
            font_size=11,
        )
    start = periods[pn]['start']
    end = periods[pn]['end']
    chart.title = f"REMP network {start}-{end}"
    #chart.configure_view(width=800, height=600,)
    chartdict[n] = chart
    


In [94]:
vconcat = alt.vconcat(chartdict[0], chartdict[1], chartdict[2])

# configure_view returns a new chart, so keep the result
vconcat = vconcat.configure_view(continuousHeight=1000, continuousWidth=800)
vconcat

In [95]:
import hvplot.networkx as hvnx
import holoviews as hv

In [96]:
chartdict2 = {}
charts=None
for period in enumerate(grphdict.keys()):
    n = period[0]
    pn = period[1]
    periodgraph, pos = period_graph(pn)
    labels = {node: node for node in periodgraph.nodes() if node in commonnames[pn]}
    start = periods[pn]['start']
    end = periods[pn]['end']
    chart = hvnx.draw(G=periodgraph, 
                      pos=pos, 
                      with_labels=True, 
                      labels=labels,
                      node_size=hv.dim('centrality')*200,
                      node_color='entity_type',
                      edge_color='community',
                      cmap='accent',
                      edge_cmap='category20',
                      node_tooltip=['name'],
                      #node_label='name',
                      font_color="black",
                      #font_size='11',
                      width=800,
                      height=600,
                      )

    #chart.configure_view(width=800, height=600,)
    chart.Overlay.opts(title=f"REMP network {start}-{end}")
    if charts is None:
        charts = chart
    else:
        charts = charts + chart

In [97]:
# N.B. this is an alternative visualisation

charts.Overlay.opts(title="REMP networks 1950s-1970s")
charts

In [98]:
relrecs = pd.DataFrame(relationship_records)
for c in ['year']:
    relrecs[c] = relrecs[c].astype('int')
relrecs.head()

Unnamed: 0,series,volume,year,article_author1_surname,article_author1_infix,article_author1_initials,article_author2_surname,article_author2_infix,article_author2_initials,preface_author1_surname,...,intro_author2_surname,intro_author2_infix,intro_author2_initials,executor_org,funder,client,editor_surname,editor_infix,editor_initials,volume_title
0,Studies over Nederlandse emigratie,1,1958,Hofstede,,B.P.,,,,Groenman,...,,,,RCE,ICEM,RCE,,,,De gaande man : gronden van de emigratiebeslis...
1,Studies over Nederlandse emigratie,2,1960,Frijda,,N.H.,,,,Haveman,...,,,,REMP,ICEM,RCE,,,,"Emigranten, niet-emigranten : kwantitatieve an..."
2,Studies over Nederlandse emigratie,3,1961,Wentholt,,R.,,,,Haveman,...,,,,REMP,ICEM,RCE,,,,Kenmerken van de Nederlandse emigrant : een an...
3,Studies over Nederlandse emigratie,4,1962,Frijda,,N.H.,,,,,...,,,,REMP,,RCE,,,,Emigranten overzee : resultaten van een eerste...
4,Publications of the research group for europea...,1,1951,Wander,,H.,,,,Bouman,...,,,,REMP,,,Beijer,,G.,The importance of emigration for the solution ...


In [99]:
def aut_to_fn(cols):
    if cols[0].strip() != '':
         return f'{cols[0]+","} {cols[1]} {cols[2] or ""}'.strip()

In [100]:
namecolumns = ['{author}_surname','{author}_infix', '{author}_initials']
t_authors = ['article_author1',
             'article_author2',
             'preface_author1',
             'preface_author2',
             'intro_author1',
             'intro_author2',
             'editor']
for a in t_authors:
    clst = [c.format(author=a) for c in namecolumns]
    colnm = '{a}'.format(a=a)
    relrecs[colnm] = relrecs[clst].apply(lambda x: aut_to_fn(x), axis=1)

In [101]:
keepcolumns = ['series', 'volume', 'volume_title', 'year', 'funder', 'client', 'executor_org'] + t_authors

In [102]:
cleanrecs = relrecs[keepcolumns].fillna('')

In [103]:
period_results = defaultdict(dict)
for period in periods:
    # period bounds are inclusive, so the end year must be part of the selection
    recs = cleanrecs.loc[relrecs.year.between(periods[period]['start'], periods[period]['end'])]
    recnrs = len(recs)
    period_results[period]['nr of titles'] = recnrs
    relationfields = t_authors + ['executor_org','funder', 'client']
    for c in relationfields:
        period_results[period][c] = len(recs[c].unique())
    

In [104]:
nodes = {'authors':['article_author1','article_author2'],
'pref_a' : ['preface_author1', 'preface_author2',],
'intro_a' : ['intro_author1', 'intro_author2'],
'editor' : ['editor'],
'funder' : ['funder'],
'executor_org':['executor_org']}

In [105]:
overall_results = {}
for c in t_authors + ['funder', 'client', 'executor_org']:
    overall_results[c] = list(relrecs[c].unique())


In [125]:
cnted = {}
for key in nodes:
    allaut = Counter()
    cnted[key] = Counter()
    for f in nodes[key]:
        cnted[key].update(cleanrecs[f].value_counts().to_dict())
overview = pd.DataFrame(cnted).fillna(0)
overview.drop(index='', inplace=True)
overview['total'] = overview.agg('sum', axis=1)

overview.sort_values(by='total', ascending=False).loc[overview.total>2]

Unnamed: 0,authors,pref_a,intro_a,editor,funder,executor_org,total
"Beijer, G.",12.0,0.0,0.0,139.0,0.0,0.0,151.0
REMP,0.0,0.0,0.0,0.0,0.0,24.0,24.0
"Groenman, Sj.",2.0,2.0,1.0,0.0,0.0,0.0,5.0
"Hofstee, E.W.",4.0,1.0,0.0,0.0,0.0,0.0,5.0
ICEM,0.0,0.0,0.0,0.0,4.0,0.0,4.0
"Edding, F.",4.0,0.0,0.0,0.0,0.0,0.0,4.0
"Mol, J.J.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Radspieler, T.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Zubrzycki, J.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Sauvy, A.",1.0,2.0,0.0,0.0,0.0,0.0,3.0


In [126]:
# count the number of contributions in the network
overview.total.value_counts()


1.0      103
2.0       18
3.0        8
5.0        2
4.0        2
24.0       1
151.0      1
Name: total, dtype: int64

In [127]:
overview.total.sum()

356.0

In [128]:
freq_auts = overview.loc[overview.authors > 1]
len(freq_auts)

25

In [154]:
freq_auts

Unnamed: 0,authors,pref_a,intro_a,editor,funder,executor_org,total
"Beijer, G.",12.0,0.0,0.0,139.0,0.0,0.0,151.0
"Edding, F.",4.0,0.0,0.0,0.0,0.0,0.0,4.0
"Hofstee, E.W.",4.0,1.0,0.0,0.0,0.0,0.0,5.0
"Richardson, A.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Hack, H.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Zubrzycki, J.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Mol, J.J.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Radspieler, T.",3.0,0.0,0.0,0.0,0.0,0.0,3.0
"Timlin, M.F.",2.0,0.0,0.0,0.0,0.0,0.0,2.0
"Wentholt, R.",2.0,0.0,0.0,0.0,0.0,0.0,2.0


In [130]:
for key in grphdict.keys():
    print(key, len(grphdict[key].edges))

1950 170
1960 64
1970 31


In [133]:
auts = overview.loc[overview.authors > 0]

In [134]:
aut_category = {}
for aut in auts.index:
    n = ', '.join(aut.split(',  '))
    aut_category[n] = entity_category.get(n) or 'unknown'
    
autcat = pd.DataFrame().from_dict(aut_category, orient="index")

In [136]:
autlst = list([', '.join(aut.split(',  ')) for aut in auts.index])

In [141]:
cat_p_df.loc[cat_p_df.fullname.isin(autlst)][['fullname','prs_category']]

Unnamed: 0,fullname,prs_category
0,"Beijer, G.",academic
1,"Groenman, Sj.",academic
2,"Zeegers, G.H.L.",academic
3,"Hofstee, E.W.",academic
8,"Sauvy, A.",academic
9,"Gottmann, J.",academic
16,"Mackenroth, G.",academic
26,"Hyrenius, H.",academic
29,"Nixon, J.W.",unknown
30,"Isaac, J.",academic


In [152]:
cat_p_df.loc[(cat_p_df.fullname.isin(autlst)) & 
            (cat_p_df.is_public_administration != '') & 
            (cat_p_df.is_academic != '')][['fullname', 'prs_category', 
                                             'is_public_administration', 
                                             'is_academic'
                                            ]]

Unnamed: 0,fullname,prs_category,is_public_administration,is_academic
1,"Groenman, Sj.",academic,1943-1950,1947
2,"Zeegers, G.H.L.",academic,1941-1950,yes
3,"Hofstee, E.W.",academic,"yes, advisor 5 ministeries",yes
8,"Sauvy, A.",academic,1939,1945
9,"Gottmann, J.",academic,1945,yes
16,"Mackenroth, G.",academic,1954-1955,yes
43,"Edding, F.",academic,1935-1943,yes
52,"Hofstede, B.P.",,yes,1964
53,"Verwey-Jonker, H.",academic,yes,1973


In [144]:
cat_p_df.loc[cat_p_df.fullname.isin(autlst)][['fullname','prs_country']].prs_country.value_counts()

NL    7
FR    2
DE    2
AU    1
UK    1
IT    1
SE    1
CH    1
IL    1
BR    1
Name: prs_country, dtype: int64

In [60]:
# cat_p_df = pd.DataFrame(categorized_persons)
# cat_p_df['fullname'] = cat_p_df.apply(lambda row: get_entity_name(entity=row), axis=1) #
country_person = {get_entity_name(entity=record):record.get('prs_country') for record in categorized_persons }

In [48]:
entity_nationality = {get_entity_name(entity = record):record.get('prs_country') or 'unknown' for record in entity_records}


In [153]:
# map each entity name to its country, falling back to 'unknown'
# (the name entity_country is an assumption; the original cell did not parse)
entity_country = {}
for record in entity_records:
    nm = get_entity_name(entity=record)
    entity_country[nm] = country_person.get(nm) or 'unknown'

In [None]:
def get_entity_country(entity: dict):
    # look up the country (not the category) by entity name; 'unknown' if missing
    return entity_nationality.get(entity.get('entity_name')) or 'unknown'


    

In [49]:
aut_country = {}
for aut in auts.index:
    n = ', '.join(aut.split(',  '))
    aut_country[n] = entity_nationality.get(n) or 'unknown'
autcountry = pd.DataFrame().from_dict(aut_country, orient="index")

In [50]:
autcountry.loc[autcountry[0]!='unknown'].value_counts()

NL    5
DE    2
FR    2
AU    1
CH    1
SE    1
UK    1
dtype: int64

In [527]:
sups = overview.loc[(overview.funder>0)|(overview.pref_a>0)].sort_values(by='total', ascending=False)
sups

Unnamed: 0,authors,pref_a,intro_a,editor,funder,executor_org,total
"Hofstee, E.W.",4.0,1.0,0.0,0.0,0.0,0.0,5.0
"Groenman, Sj.",2.0,2.0,1.0,0.0,0.0,0.0,5.0
ICEM,0.0,0.0,0.0,0.0,4.0,0.0,4.0
"Sauvy, A.",1.0,2.0,0.0,0.0,0.0,0.0,3.0
"Zeegers, G.H.L.",1.0,1.0,1.0,0.0,0.0,0.0,3.0
"Isaac, J.",2.0,1.0,0.0,0.0,0.0,0.0,3.0
"Ipsen, G.",1.0,1.0,0.0,0.0,0.0,0.0,2.0
"Mill, van A.N.",1.0,1.0,0.0,0.0,0.0,0.0,2.0
"Haveman, B.W.",0.0,2.0,0.0,0.0,0.0,0.0,2.0
"Eidt, R.C.",0.0,1.0,0.0,0.0,0.0,0.0,1.0


In [528]:
sup_country = {}
for sup in sups.index:
    n = ', '.join(sup.split(',  '))
    sup_country[n] = entity_nationality.get(n) or 'unknown'

In [529]:
[x for x in entity_nationality if x[0]=='V']

['Vito, F.', 'Vergottini, M.', 'Vampa, D']

In [530]:
supcountry = pd.DataFrame().from_dict(sup_country, orient='index')

In [531]:
supcountry.sort_values(by=0)

Unnamed: 0,0
"Rappard, W.E.",CH
"Sauvy, A.",FR
"Bouman, P.J.",NL
"Zeegers, G.H.L.",NL
"Hofstee, E.W.",NL
"Groenman, Sj.",NL
"Isaac, J.",UK
"Visser t Hooft, W.A.",unknown
"Beveridge, W.H.",unknown
"Salin, E.",unknown


In [532]:
supcountry.loc[supcountry[0]!='unknown'].value_counts()

NL    4
CH    1
FR    1
UK    1
dtype: int64

In [533]:
overview.loc[(overview.authors >0) & (overview.total > overview.authors)]

Unnamed: 0,authors,pref_a,intro_a,editor,funder,executor_org,total
"Beijer, G.",12.0,0.0,0.0,139.0,0.0,0.0,151.0
"Hofstee, E.W.",4.0,1.0,0.0,0.0,0.0,0.0,5.0
"Isaac, J.",2.0,1.0,0.0,0.0,0.0,0.0,3.0
"Ipsen, G.",1.0,1.0,0.0,0.0,0.0,0.0,2.0
"Sauvy, A.",1.0,2.0,0.0,0.0,0.0,0.0,3.0
"Mill, van A.N.",1.0,1.0,0.0,0.0,0.0,0.0,2.0
"Groenman, Sj.",2.0,2.0,1.0,0.0,0.0,0.0,5.0
"Zeegers, G.H.L.",1.0,1.0,1.0,0.0,0.0,0.0,3.0


In [534]:
subgraphs = {}
for sg in grphdict:
    subgraphs[sg] =[]
    for node in grphdict[sg].nodes():
        n = entity_nationality.get(node)
        if n:
            subgraphs[sg].append({node: n})

In [535]:
subgraphs

{1950: [{'Bouman, P.J.': 'NL'},
  {'Beijer, G.': 'NL'},
  {'Groenman, Sj.': 'NL'},
  {'Rappard, W.E.': 'CH'},
  {'Edding, F.': 'DE'},
  {'Oudegeest, J.J.': 'NL'},
  {'Sauvy, A.': 'FR'},
  {'Gadolin, de, A.': 'FI'},
  {'Zeegers, G.H.L.': 'NL'},
  {'Hofstee, E.W.': 'NL'},
  {'Mackenroth, G.': 'DE'},
  {'Nixon, J.W.': 'CH'},
  {'Isaac, J.': 'UK'},
  {'Hyrenius, H.': 'SE'},
  {'Appleyard, R.T.': 'AU'},
  {'Gottmann, J.': 'FR'}],
 1960: [{'Beijer, G.': 'NL'}, {'Nixon, J.W.': 'CH'}, {'Groenman, Sj.': 'NL'}],
 1970: [{'Beijer, G.': 'NL'}]}

For the construction of the REMP network as shown in figure 1, we used the data from the REMP publications (dataset IV), as it contains the most detailed data about associations between people. We divided it into three periods: the 1950s, 1960s and 1970s. The 1970s also include the few publications that were published in the 1980s. The heyday of REMP was the 1950s, when 87 titles were published, against 26 in the 1960s and 12 in the 1970s. This is mirrored in the corresponding networks, which counted 170 connections (edges) between actors in the 1950s, 64 in the 1960s and 31 in the 1970s.
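The period split and the edge counts can be illustrated with a minimal, dependency-free sketch. It assumes that each association is available as a `(year, source, target)` tuple and that repeated links between the same two actors count once per period; the function names and the example links are invented for illustration and are not taken from dataset IV.

```python
from collections import defaultdict

def period_of(year: int) -> int:
    """Map a year to its decade; the 1980s are folded into the 1970s."""
    decade = (year // 10) * 10
    return min(decade, 1970)

def edges_per_period(links):
    """Count the distinct undirected links (edges) per period."""
    edges = defaultdict(set)
    for year, source, target in links:
        # a frozenset ignores direction, the set ignores repeated links
        edges[period_of(year)].add(frozenset((source, target)))
    return {period: len(found) for period, found in sorted(edges.items())}

# invented example links
example_links = [
    (1953, "Person A", "REMP"),
    (1958, "Person A", "REMP"),   # repeated link, counted once
    (1958, "Person B", "ICEM"),
    (1964, "Person A", "REMP"),
    (1981, "Person C", "REMP"),   # 1980s title, counted with the 1970s
]
edge_counts = edges_per_period(example_links)
print(edge_counts)
```

In the notebook itself the per-period graphs live in `grphdict`, so the counts come from `len(grphdict[key].edges)` rather than from a helper like this.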

Over the whole network there were in total 134 different contributors to the volumes: 113 article authors, 23 preface authors, 6 introduction authors, 2 editors and 4 funders (some contributors held more than one role, so the role counts add up to more than 134). Most (102) made only a single contribution of any sort, 18 made two contributions and only 13 contributed three or more times.
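As an illustration of how such a tally can be derived, the sketch below counts contributions per person and groups them into one, two, or three and more. The role names follow the columns of the `overview` table (`authors`, `pref_a`, `funder`), but the persons and numbers are invented and the helper `bucket` is hypothetical.

```python
from collections import Counter

# invented role counts per contributor
role_counts = {
    "Person A": {"authors": 4, "pref_a": 1},
    "Person B": {"authors": 1},
    "Person C": {"pref_a": 2},
    "Person D": {"funder": 1},
}

def bucket(total: int) -> str:
    """Group totals into '1', '2' and '3+'."""
    return "3+" if total >= 3 else str(total)

# total number of contributions of any sort per person
totals = {name: sum(roles.values()) for name, roles in role_counts.items()}
# how many persons fall into each bucket
distribution = Counter(bucket(total) for total in totals.values())
# persons who appear in more than one role
multi_role = sorted(name for name, roles in role_counts.items() if len(roles) > 1)

print(totals, dict(distribution), multi_role)
```

Because one person can hold several roles, the per-role counts will generally sum to more than the number of distinct contributors.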

Most authors wrote or co-wrote a single article; 26 authors contributed more than once. We know little about most authors, but all authors whose background is known had an academic status. Eight of the authors were also active in other roles. REMP authors were not just from the Netherlands. Of the 13 authors whose nationality is known, 5 were from the Netherlands, 2 each from Germany and France, and 1 each from Australia, Switzerland, Sweden and the United Kingdom.

Some authors also fulfilled other roles in the network. Günter Beijer was the editor of the whole series, and 7 authors wrote a preface for works in which they were not involved as authors. Groenman and Zeegers held three different roles, as they also wrote an introduction.

There were 25 persons or institutions who either wrote prefaces or funded the REMP publications. ICEM was the largest supporter with 4 titles; another 4 supporters came from the Netherlands and 1 each from Switzerland, France and the UK.

This description is also partly visualised in the graphs (figure 2). A few important conclusions follow from this analysis.

1. Günter Beijer was the founder of the REMP and, throughout the 1950s, 1960s and 1970s, the central person for its scientific output. His network consisted of academics and administrators who collaborated on a number of studies.
2. The people and organisations in the network were not only from the Netherlands, but also from a number of other countries.
3. There were numerous connections between politicians and academics, who together established a discourse coalition in which the academics (mostly) wrote the contributions, while administrators and their organisations supported the research by funding it, by commissioning studies and by writing prefaces that pointed out its importance for policy formation.
4. The international organisation ICEM was explicitly involved in the network from 1958, when it started to fund studies on emigration. This was mediated through Bas Haveman, who became director of ICEM in 1961.

**N.B. some remarks and omissions** I have tried to write the paragraphs above on the basis of the REMP title spreadsheet, but the spreadsheet needs more work. For a number of people we have no country, even if it is recorded in other tabs. For example, M. Klompe has no country, and there are a few more.
I constructed communities for the people in the graph mostly, but not entirely, from the personencategorieen tab.

Altogether we have to rethink what goes where in the sheets and what we want to repeat, because as it stands it is not complete enough. This is not really apparent in the graph, but the description I made above feels a bit inaccurate.


### Network Overlap

From the time Haveman took office at ICEM, he involved the REMP network in its operations, striving to base policies on research. This was evident when the previously separate journals _Migration_ and the _REMP Bulletin_ were merged into a new journal, _International Migration_. The network itself was also extended. The REMP publications dataset is not suited to investigating this. Therefore, we compared the evidence we have in other datasets for overlap. These datasets are shown in the diagram in figure 1.

More specifically, we study the overlap between members of the REMP Board, the ICEM directors and deputy directors, the Dutch Government and the authors of the studies in REMP, IM and IMR. 
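Conceptually, this overlap is a set intersection between groups of normalised names. The sketch below shows the idea with plain Python sets; the group labels are shortened and all member names are invented placeholders, so it illustrates the comparison rather than the notebook's actual data or method.

```python
from itertools import combinations

# invented members per group, keyed by a normalised "Surname, Initial" form
groups = {
    "REMP board": {"Alpha, A", "Beta, B", "Gamma, C"},
    "ICEM directors": {"Delta, D", "Epsilon, E"},
    "REMP_IM authors": {"Alpha, A", "Delta, D", "Zeta, F"},
}

# names shared by each pair of groups
overlap = {
    (first, second): sorted(groups[first] & groups[second])
    for first, second in combinations(groups, 2)
}

for pair, names in overlap.items():
    print(pair, names)
```

The comparison only works if the same person is written identically in every group, which is why the name normalisation further on matters.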


In [157]:
import pandas as pd

records_file = '../data/main-review-article-records.csv'

# load the csv data into a data frame
pub_df = pd.read_csv(records_file)


In [161]:
from scripts.data_wrangling import map_dataset

pub_df['dataset'] = pub_df.apply(lambda x: map_dataset(x['publisher'], x['article_type']), axis=1)
vals = pub_df.dataset.value_counts().to_frame()
vals

Unnamed: 0,dataset
IMR_review,1842
IMR_research,1678
REMP_IM,762


#### Clustering Author Names



We want to see which authors published in both journals, and how often. This requires a number of transformations:

1. splitting records of multi-author papers into a record per author
2. normalising author names such that variant spellings are mapped to a single version. 

The latter step is always a risky operation, because using only the surface form of a name can result in two persons with similar names being treated as a single person. Given that this dataset focuses narrowly on authors of articles in the two journals, we assume the chance that two authors share the same surname and initials is low.
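The first transformation can be sketched without pandas as follows. The sketch assumes, as in the records used here, that authors are separated by ` && ` and that the parallel author fields list the authors in the same order; the function name `split_record` and the example record are invented for illustration.

```python
def split_record(record, author_fields, sep=" && "):
    """Split one multi-author record into one record per author."""
    parts = {field: (record.get(field) or "").split(sep) for field in author_fields}
    lengths = {len(values) for values in parts.values()}
    # the parallel fields must contain the same number of authors
    if len(lengths) != 1:
        raise ValueError(f"author fields differ in length: {record}")
    # all other fields are copied unchanged to every per-author record
    shared = {key: value for key, value in record.items() if key not in author_fields}
    return [
        {**shared, **{field: parts[field][i] for field in author_fields}}
        for i in range(lengths.pop())
    ]

# invented example record
example = {
    "journal": "IM",
    "article_author": "A. Alpha && B. Beta",
    "article_author_index_name": "Alpha, A. && Beta, B.",
}
author_fields = ("article_author", "article_author_index_name")
rows = split_record(example, author_fields)
print(rows)
```

Raising an error on unequal lengths is a design choice of this sketch: it surfaces records where an author is missing from one of the fields instead of silently misaligning names.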




##### Splitting multi-author records

In [363]:
# Code adapted from https://stackoverflow.com/questions/50731229/split-cell-into-multiple-rows-in-pandas-dataframe

import numpy as np
from itertools import chain

# return a flat list from a series of ' && '-separated strings
def chainer(s):
    return list(chain.from_iterable(s.fillna('').str.split(' && ')))

# calculate lengths of splits
lens = pub_df['article_author'].fillna('').str.split(' && ').map(len)

# create new dataframe, repeating or chaining as appropriate
split_pub_df = pd.DataFrame({
    'journal': np.repeat(pub_df['journal'], lens),
    'issue_pub_year': np.repeat(pub_df['issue_pub_year'], lens),
    'publisher': np.repeat(pub_df['publisher'], lens),
    'dataset': np.repeat(pub_df['dataset'], lens),
    'article_author': chainer(pub_df['article_author']),
    'article_author_index_name': chainer(pub_df['article_author_index_name']),
    'article_author_affiliation': chainer(pub_df['article_author_affiliation'])
})

split_pub_df = split_pub_df.reset_index(drop=True)


##### Normalising author names



There is a lot of variation in how author names are represented: sometimes with full first and middle names, sometimes with only the first name or only initials, or with the first name in full but the middle names as initials.

We start from the author format where the surname is followed by the first and middle names (field `article_author_index_name`). We apply the following normalisation and mapping steps:

1. transform the `article_author_index_name` to title casing (meaning each initial character of a name part is uppercase and the rest is lowercase),
2. remove everything after the first letter that follows the surname,
3. transform all uses of `ij` to `y`, as Dutch and German names containing `ij` are sometimes spelled with `y`, e.g. `Gunther Beijer` vs. `Gunther Beyer`.
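The three steps above can be sketched as a single function. This is a hypothetical stand-in, not the `parse_surname_initial` helper from `scripts.data_wrangling`, whose implementation is not shown here; it assumes names of the form `Surname, Initials` and applies the `ij` to `y` mapping to the surname only.

```python
import re

def normalise_index_name(name: str) -> str:
    """Illustrative normalisation of an author index name."""
    # step 1: title casing
    name = name.strip().title()
    # split "Surname, Initials" on the first comma
    surname, _, rest = name.partition(",")
    # step 3: map "ij" to "y" in the surname
    surname = surname.replace("ij", "y").replace("Ij", "Y")
    # step 2: keep only the first letter that follows the surname
    match = re.search(r"[^\W\d_]", rest)
    if match is None:
        return surname
    return f"{surname}, {match.group(0)}"

for raw in ("BEIJER, G.", "Groenman, Sj.", "Escaler, N.L. (Narcisa)"):
    print(raw, "->", normalise_index_name(raw))
```

Reducing the initials to a single letter merges more spelling variants, at the cost of a somewhat higher chance that two different persons collapse into one key.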


In [364]:
from scripts.data_wrangling import parse_surname, parse_surname_initial, acronym

# Make sure title case is used consistently in the author index name column
split_pub_df['article_author_index_name'] = split_pub_df['article_author_index_name'].str.title()
# add a column with surname and first name initial extracted from the author index name
split_pub_df['author_surname_initial'] = split_pub_df.article_author_index_name.apply(parse_surname_initial)
# add a column with surname only
split_pub_df['author_surname'] = split_pub_df.article_author_index_name.apply(parse_surname)
# add a column with the decade in which the issue was published that contains an article
split_pub_df['issue_pub_decade'] = split_pub_df.issue_pub_year.apply(lambda x: int(x/10)*10)
# map journal names to their acronyms
split_pub_df.journal = split_pub_df.journal.apply(acronym)

# remove articles with no authors
split_pub_df =  split_pub_df[split_pub_df.article_author != '']


#### Parsing Organisational Membership Records



We consider the REMP and ICEM as semi-political organisations. Some members of the Dutch government collaborated closely with REMP and ICEM.


In [365]:
from scripts.network_analysis import retrieve_spreadsheet_records

entity_records = retrieve_spreadsheet_records(record_type='categories')
print('Number of records:' , len(entity_records))


Number of records: 74


In [367]:
from scripts.data_wrangling import parse_author_index_name

board_df = pd.DataFrame(entity_records)
b_cols = {c:c.lower() for c in board_df.columns}
board_df.rename(columns=b_cols, inplace=True)
board_df['article_author_index_name'] = board_df.apply(parse_author_index_name, axis=1)
board_df['author_surname_initial'] = board_df.article_author_index_name.apply(parse_surname_initial)
board_df

Unnamed: 0,organisation,period_start,last_known_date,prs_id,prs_surname,prs_infix,prs_initials,prs_function,prs_category,is_academic,is_public_administration,sources,prs_country,prs_role1,prs_role2,prs_role3,remarks,article_author_index_name,author_surname_initial
0,REMP,1952,1983,1,Beijer,,G.,"demographer, The Hague",academic,yes,,,NL,founder,member_MC,secretary-editor,director-editor (1969),"Beijer, G.","Beyer, G"
1,REMP,1952,1969,2,Groenman,,Sj.,"sociologist, Leiden",academic,1947,1943-1950,https://nl.wikipedia.org/wiki/Sjoerd_Groenman ...,NL,founder,member_MC,vice-chair_BoD,,"Groenman, Sj.","Groenman, S"
2,REMP,1952,1969,3,Zeegers,,G.H.L.,"economist, sociologist, Nijmegen",academic,yes,1941-1950,https://www.ru.nl/kdc/bladeren/archieven-thema...,NL,founder,member_MC,member_BoD,,"Zeegers, G.H.L.","Zeegers, G"
3,REMP,1952,1969,4,Hofstee,,E.W.,"sociologist, Wageningen",academic,yes,"yes, advisor 5 ministeries",http://resources.huygens.knaw.nl/bwn1880-2000/...,NL,founder,member_BoD,,,"Hofstee, E.W.","Hofstee, E"
4,REMP,1952,1969,5,Bouman,,P.J.,"sociologist, Groningen",academic,yes,,"https://nl.wikipedia.org/wiki/P.J._Bouman, htt...",NL,member_BoD,,chair_BoD (1954),,"Bouman, P.J.","Bouman, P"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,ICEM,1970,1988,68,Maselli,,G.,deputy director general,,,,,IT,,,,,"Maselli, G.","Maselli, G"
70,ICEM,1989,1993,69,Charry-Samper,,H.,deputy director general,,,,,CO,,,,,"Charry-Samper, H.","Charry-Samper, H"
71,ICEM,1994,1999,70,Escaler,,N.L. (Narcisa),deputy director general,,,,,PH,,,,,"Escaler, N.L. (Narcisa)","Escaler, N"
72,ICEM,1999,2009,71,Ndioro,,N. (Ndiaye),deputy director general,,,,,SN,,,,,"Ndioro, N. (Ndiaye)","Ndioro, N"


In [368]:
from scripts.data_wrangling import yr2cat
board_df['period'] = board_df.apply(lambda x: yr2cat(x), axis=1)


In [369]:
decades = {1950:(1950, 1960),
 1960:(1960, 1970),
 1970:(1970, 1980),
 1980:(1980, 1990),
 1990:(1990, 2000),
 2000:(2000, 2010)
          }

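def cutdecade(x, decade):
    # True if the interval x overlaps the decade given as (start, end).
    # NB: both bounds are treated as inclusive, so a period that starts exactly
    # on the end year (e.g. 1970 for the 1960s) also counts towards that decade.
    return x.right >= decade[0] and x.left <= decade[1]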



for key in decades:
    decade = decades[key]
    board_df[key] = board_df.period.apply(lambda x: cutdecade(x, decade))
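The `decades` tuples above share their boundary year (1960 closes the 1950s and opens the 1960s), so a membership period that starts exactly in 1970 is also flagged for the 1960s. If that is not intended, the overlap test can be written against the last year of each decade instead. The sketch below is an alternative under that assumption, using plain integers rather than the interval objects stored in `board_df.period`; the function name is invented.

```python
def overlaps_decade(start: int, end: int, decade_start: int) -> bool:
    """True if the years start..end (inclusive) touch the ten years of the decade."""
    decade_end = decade_start + 9  # e.g. 1969 for the 1960s
    return start <= decade_end and end >= decade_start

# a period running from 1970 to 1988 touches the 1970s and 1980s only
flags = {decade: overlaps_decade(1970, 1988, decade)
         for decade in (1950, 1960, 1970, 1980, 1990)}
print(flags)
```

Switching to this test would change some of the decade flags shown in the tables that follow, so it should only be applied together with a re-run of the later cells.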

In [370]:
from scripts.data_wrangling import map_bool

decade_cols = [1950, 1960, 1970, 1980, 1990]
org_cols = ['author_surname_initial', 'organisation']
display_cols =  org_cols + decade_cols


temp_board_df = board_df[org_cols].merge(board_df[decade_cols].astype(int), left_index=True, right_index=True)
temp_board_df = temp_board_df.rename(columns={'dataset': 'cat'})
temp_board_df['in_board'] = 1
temp_board_df

Unnamed: 0,author_surname_initial,organisation,1950,1960,1970,1980,1990,in_board
0,"Beyer, G",REMP,1,1,1,1,0,1
1,"Groenman, S",REMP,1,1,0,0,0,1
2,"Zeegers, G",REMP,1,1,0,0,0,1
3,"Hofstee, E",REMP,1,1,0,0,0,1
4,"Bouman, P",REMP,1,1,0,0,0,1
...,...,...,...,...,...,...,...,...
69,"Maselli, G",ICEM,0,1,1,1,0,1
70,"Charry-Samper, H",ICEM,0,0,0,1,1,1
71,"Escaler, N",ICEM,0,0,0,0,1,1
72,"Ndioro, N",ICEM,0,0,0,0,1,1


In [371]:
decade_pub_df = pd.get_dummies(split_pub_df.issue_pub_decade)

temp_pub_df = split_pub_df[['author_surname_initial', 'dataset']].merge(decade_pub_df, left_index=True, right_index=True)
temp_pub_df = temp_pub_df.rename(columns={'dataset': 'cat'})
temp_pub_df = temp_pub_df.groupby(['author_surname_initial', 'cat']).sum().reset_index()
temp_pub_df['in_pub'] = 1
temp_pub_df



Unnamed: 0,author_surname_initial,cat,1950,1960,1970,1980,1990,in_pub
0,A.H.R.,IMR_research,0,0,1,0,0,1
1,"Abad, R",IMR_research,0,0,0,2,0,1
2,"Abadan-Unat, N",IMR_research,0,0,1,1,3,1
3,"Abadan-Unat, N",IMR_review,0,0,0,1,0,1
4,"Abalos, D",IMR_review,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...
2847,"Zolberg, A",IMR_review,0,0,0,1,0,1
2848,"Zubrzycki, J",IMR_research,3,0,2,1,0,1
2849,"Zubrzycki, J",REMP_IM,0,5,0,1,0,1
2850,"Zucchi, J",IMR_review,0,0,0,1,0,1


In [372]:
#temp_pub_df.set_index('author_surname_initial')

temp_df = pd.concat([temp_board_df.rename(columns={'organisation': 'cat'}).set_index('author_surname_initial'), 
                     temp_pub_df.rename(columns={'dataset': 'cat'}).set_index('author_surname_initial')])

for name in temp_df.index:
    temp_df.loc[name,'in_pub'] = temp_df.loc[name,'in_pub'].max()
    temp_df.loc[name,'in_board'] = temp_df.loc[name,'in_board'].max()
temp_df = temp_df.reset_index()


In [373]:
from scripts.data_wrangling import highlight_decade

temp2_df = temp_df[(temp_df.in_board == 1) & (temp_df.in_pub == 1)].drop(['in_board', 'in_pub'], axis=1)
temp2_df.sort_values(by='author_surname_initial').style.apply(highlight_decade, axis=1)


Unnamed: 0,author_surname_initial,cat,1950,1960,1970,1980,1990
166,"Appleyard, R",REMP_IM,0,1,1,3,14
165,"Appleyard, R",IMR_research,2,0,0,1,0
49,"Appleyard, R",REMP,0,1,0,0,0
194,"Avila, F",REMP_IM,0,1,0,0,0
193,"Avila, F",IMR_research,1,0,0,0,0
47,"Avila, F",REMP,0,1,0,0,0
203,"Backer, J",REMP_IM,0,1,0,0,0
37,"Backer, J",REMP,1,1,0,0,0
312,"Besterman, W",REMP_IM,0,2,0,0,0
68,"Besterman, W",ICEM,0,1,1,0,0


In [378]:
board_df.columns

Index([             'organisation',              'period_start',
                 'last_known_date',                    'prs_id',
                     'prs_surname',                 'prs_infix',
                    'prs_initials',              'prs_function',
                    'prs_category',               'is_academic',
        'is_public_administration',                   'sources',
                     'prs_country',                 'prs_role1',
                       'prs_role2',                 'prs_role3',
                         'remarks', 'article_author_index_name',
          'author_surname_initial',                    'period',
                              1950,                        1960,
                              1970,                        1980,
                              1990,                        2000],
      dtype='object')

In [426]:
# sum the decade counts and membership flags per person
temp_df.groupby('author_surname_initial').sum(numeric_only=True)

Unnamed: 0_level_0,1950,1960,1970,1980,1990,in_board,in_pub
author_surname_initial,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
A.H.R.,0,0,1,0,0,0.0,1.0
"Abad, R",0,0,0,2,0,0.0,1.0
"Abadan-Unat, N",0,0,1,2,3,0.0,2.0
"Abalos, D",0,0,1,0,0,0.0,1.0
"Abell, N",0,0,0,0,1,0.0,1.0
...,...,...,...,...,...,...,...
"Zodgekar, A",0,0,0,0,1,0.0,1.0
"Zolberg, A",0,0,0,4,0,0.0,2.0
"Zubrzycki, J",3,5,2,2,0,0.0,2.0
"Zucchi, J",0,0,0,1,0,0.0,1.0


In [2]:
# calculate membership of groups for overview
temp_df['in_total'] = temp_df[['in_board', 'in_pub']].sum(axis=1)
r_df = (temp_df.groupby('author_surname_initial')
               .sum(numeric_only=True)
               .sort_values('in_total', ascending=False))
# number of persons per membership score, and their share of all persons
vals = r_df['in_total'].value_counts().to_frame('persons')
vals['percentage'] = round((vals.persons / 2540) * 100, 0)
vals

In [423]:
r_df

Unnamed: 0_level_0,1950,1960,1970,1980,1990,in_board,in_pub,in_total
author_surname_initial,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
"Beyer, G",11,9,5,3,0,4.0,4.0,8.0
"Maselli, G",0,2,3,1,0,3.0,3.0,6.0
"Thomas, J",0,2,1,1,0,3.0,3.0,6.0
"Thomas, B",0,3,2,0,0,3.0,3.0,6.0
"Haveman, B",1,6,0,0,0,3.0,3.0,6.0
...,...,...,...,...,...,...,...,...
"Hardjono, J",0,0,0,1,0,0.0,1.0,1.0
"Haritos-Fatouros, M",0,0,0,1,0,0.0,1.0,1.0
"Harles, J",0,0,0,0,1,0.0,1.0,1.0
"Harmeling, M",0,0,1,0,0,0.0,1.0,1.0


We calculated a score for persons in the different datasets. It consists of one score for publication activities and one for board activities.  
Most persons (87 %) were part of a single group and only one (Günter Beijer) was part of different groups in 


<hr>

## Outside of the main article, junkyard and editorial stuff

<hr>

In [75]:
import os
outdir = "/Users/rikhoekstra/Downloads/"
for item in grphdict:
    outfile = os.path.join(outdir, str(item) + '.gml') 
    nx.write_gml(grphdict[item], outfile)

In [85]:
import json
periodgraph, pos = period_graph(1950)
print(pos)

{'REMP': array([ 0.47645627, -0.24204383]), 'Wander, H.': array([ 0.07814179, -1.14993365]), 'Bouman, P.J.': array([-0.13040645, -0.9333079 ]), 'Beijer, G.': array([-0.39192904, -0.72436495]), 'Citroen, H.A.': array([ 1.5680785, -3.0344943]), 'Groenman, Sj.': array([ 1.64511349, -2.12243838]), 'Rappard, W.E.': array([ 1.01020975, -2.78173835]), 'Edding, F.': array([ 0.45804179, -1.04323302]), 'Salin, E.': array([ 0.69042329, -1.22250873]), 'Oudegeest, J.J.': array([-0.11189334, -0.09676409]), 'Sauvy, A.': array([-0.28321688, -0.39580113]), 'Brink, van den , T.': array([-0.45086167, -0.64607425]), 'Gadolin, de, A.': array([ 1.05121984, -0.94946601]), 'Zeegers, G.H.L.': array([ 1.15716194, -0.56204656]), 'HWvanLoon Fellowship, NL regering': array([-2.16464384,  2.80157607]), 'Institute of International Education': array([-1.97957795,  3.39383168]), 'Petersen, W.': array([-2.36093296,  3.39533529]), 'Hofstee, E.W.': array([-1.63211191,  2.94658584]), 'Davis, K.': array([-2.4868211,  3.002

In [79]:
commonnames

{1950: {'Beijer, G.',
  'Groenman, Sj.',
  'Hofstede, B.P.',
  'ICEM',
  'Mol, J.J.',
  'Nixon, J.W.',
  'RCE',
  'REMP',
  'Richardson, A.',
  'Timlin, M.F.',
  'Wentholt, R.'},
 1960: {'Beijer, G.',
  'Groenman, Sj.',
  'Hofstede, B.P.',
  'ICEM',
  'Mol, J.J.',
  'Nixon, J.W.',
  'RCE',
  'REMP',
  'Richardson, A.',
  'Timlin, M.F.',
  'Wentholt, R.'},
 1970: {'Beijer, G.', 'REMP'}}

## Next sections

[ Concepts and a genealogy: ICEM, REMP and the Dutch connection]

[discourse coalitions: Fischer (technocracy / discourse coalition: p. 22, 28, 30 (method: articles, books, conferences); 32-34, uses Wagner and Hajer, emphasises the agenda-setting power )) Wagner; Lutz; Hajer; Technocracy. Problematise: in what way does the selection of the scholarship reflect the discourse (environmental issues, Hajer 1993 (broader focus than Wagner/Lutz; 1995: emphasises the importance of institutional genealogy (p.5))
Argue that ICEM and REMP are an interesting case study, because of the difficulty of understanding the character of ICEM; elaborate on the periodization introduced by Lutz (NIOD-paper)
connection to recent historiography on ICEM: Ventura; Parsanoglu; Geiger. Combine with previous analysis Van Faassen 2014.]

[{ elaborate on analogue archival information_Collection Beyer:
REMP started as a ‘Dutch-German bilateral group on refugee studies’ and evolved into a European group to which ‘technocrats’ such as Sauvy were invited. [NB: Integrate Philip Nord’s analysis of French technocracy (French model) with the ‘Dutch: very comparable background to Haveman (NB: Ido de Haan; Nele Beyen)
- Beyen had contact with US delegates during the Brussels Conference (= constitution of PICMME]

 during the constitutive REMP conference 1952: representative of ICEM-dep. director Jacobsen (P.Jarrell, see memorandum, inv. 31, photo 201801054_163359) as observer: influencing REMP to focus not only on Europe, but also on overseas migration]

[analysis- connecting to notebooks
how do we translate our research questions into methodological ones: section by Rik
What data sets do we need? Mention them in the narrative; justification in the notebooks (sections by Rik/Marijn): 
Constructing the datasets for networks manually
Meta data of journals IM / IMR for discovering discourse
NB: IMR will be used as control dataset, to make comparison possible: we do not elaborate on this journal specifically.

[The early years of REMP and ICEM: 1950s-1960]
Narrative: explanation of role Haveman in constitution of PICMME / ICEM: in the Netherlands; at the international theatre (connection Australia/ Canada) (Van Faassen 2014; 2017)
Narrative: Dutch governmental studies on emigration: supporting Dutch policy (Haveman-Beyer); constitution of REMP in the Netherlands

[Analysis: Connecting to notebooks: NB: REMP-bulletin and IM are still 2 journals; IMR doesn’t exist yet.
What is the network; what is the discourse; how can we explain this with regard to previous research?

[1960-1980] <Haveman-Beyer period>
	Narrative: serious financial problems at ICEM; change in US policy towards development aid for Latin America; the combination of both was partly solved by Haveman / Australia together with George Warren (USA): (Van Faassen, 2014, 2017; check against Ventura, Parsanoglu]

Haveman director-general ICEM; merger IM - REMP-bulletin

[Analysis: connecting to notebooks: what network (NB: board of directors is still known from analogue research) what discourse: explanation : policy change USA→ Latin America….

<1980-2000> Australia period 


Conclusions
ICEM/IOM is a vehicle for being taken seriously academically and for creating a humanitarian image
People in ICEM/IOM also sit in national governments -> national governments use ICEM as room for manoeuvre for remote control
return to the Petersen quote: house organ: this can still be demonstrated on the basis of recent reviews of ICEM.
discourse coalitions alone are insufficient as an analytical concept to explain ICEM fully, but they do lead to follow-up questions


[1] <possibly also address the fact that nobody really gets access to the ICEM archive: Feldblum / Steinert even believes that it has been destroyed…check with Ventura / Geiger>
[2] <Archival reference to the Statutes or data sheet>.



- 1952 formal foundation of REMP (Beyer / sociologists like Hofstee); sociologists have an important influence on Dutch emigration from the moment of appointment of Haveman (nov.1950); it is integrated into the work of the _Commissariaat voor de Emigratie_ by way of _Studies over NL emigratie_
- 1958-1962 publication of the most important Dutch emigration studies (supporting policies)
- 1961-1967 Haveman became director-general of ICEM (with the help of USA and Australia) 
- ICEM was founded in 1952, the successor to PICMME and the predecessor of ICM (1980) and finally IOM (1989)
[https://www.iom.int/iom-history], 
also see schema in Van Faassen, _polder en emigratie_, 126 

[N] 