programminghistorian · anisa-hawes · Feb 10, 2023 · Feb 3, 2023 · Feb 3, 2023 · Feb 3, 2023
diff --git a/en/lessons/introduction-to-stylometry-with-python.md b/en/lessons/introduction-to-stylometry-with-python.md
@@ -53,7 +53,7 @@ Please note that the code in this lesson has been designed to run sequentially.
 
 ## Prior Reading
 
-If you do not have experience with the Python programming language or are finding examples in this tutorial difficult, the author recommends you read the lessons on [Working with Text Files in Python](/lessons/working-with-text-files) and [Manipulating Strings in Python](/lessons/manipulating-strings-in-python).
+If you do not have experience with the Python programming language or are finding examples in this tutorial difficult, the author recommends you read the lessons on [Working with Text Files in Python](/lessons/working-with-text-files) and [Manipulating Strings in Python](/lessons/manipulating-strings-in-python). Please note that those lessons were written in Python version 2 whereas this one uses Python version 3. The differences in [syntax](https://en.wikipedia.org/wiki/Syntax) between the two versions of the language can be subtle. If you are confused at any time, follow the examples as written in this lesson and use the other lessons as background material. (More precisely, the code in this tutorial was written using [Python 3.6.4](https://www.python.org/downloads/release/python-364/); the [f-string construct](https://docs.python.org/3/whatsnew/3.6.html#pep-498-formatted-string-literals) in the line `with open(f'data/federalist_{filename}.txt', 'r') as f:`, for example, requires Python 3.6 or a more recent version of the language.)
 
 ## Required materials
 
@@ -153,7 +153,7 @@ Next, as we are interested in each author's vocabulary, we will define a short P
 def read_files_into_string(filenames):
     strings = []
     for filename in filenames:
-        with open(f'data/federalist_{filename}.txt') as f:
+        with open(f'data/federalist_{filename}.txt', 'r') as f:
             strings.append(f.read())
     return '\n'.join(strings)
 ```
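
Review note on the `open()` change in this hunk: as far as Python's built-in `open()` is concerned, the mode defaults to `'r'` (read, text), so adding `'r'` makes the intent explicit without changing behaviour. A minimal sketch, assuming a hypothetical scratch file in place of `data/federalist_1.txt` (the `encoding` argument is added here only to keep the sketch portable):

```python
import os
import tempfile

# Hypothetical scratch file standing in for data/federalist_1.txt.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "federalist_1.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("To the People of the State of New York")

    # Mode omitted: open() falls back to its default, 'r'.
    with open(path, encoding="utf-8") as f:
        implicit = f.read()

    # Mode spelled out, as in the updated lesson code.
    with open(path, "r", encoding="utf-8") as f:
        explicit = f.read()

# Both calls read the same text.
print(implicit == explicit)
```

Either spelling should work for readers; the explicit form is arguably easier for beginners to follow.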
@@ -191,6 +191,7 @@ The code required to calculate characteristic curves for the *Federalist*'s auth
 ```python
 # Load nltk
 import nltk
+nltk.download('punkt')
 %matplotlib inline
 
 # Compare the disputed papers to those written by everyone,
@@ -207,10 +208,10 @@ for author in authors:
     federalist_by_author_tokens[author] = ([token for token in tokens
                                             if any(c.isalpha() for c in token)])
 
-    # Get a distribution of token lengths
-    token_lengths = [len(token) for token in federalist_by_author_tokens[author]]
-    federalist_by_author_length_distributions[author] = nltk.FreqDist(token_lengths)
-    federalist_by_author_length_distributions[author].plot(15,title=author)
+    # Get a distribution of token lengths
+    token_lengths = [len(token) for token in federalist_by_author_tokens[author]]
+    federalist_by_author_length_distributions[author] = nltk.FreqDist(token_lengths)
+    federalist_by_author_length_distributions[author].plot(15, title=author)
 ```
 
 The '%matplotlib inline' declaration below 'import nltk' is required if your development environment is a [Jupyter Notebook](http://jupyter.org/), as it was for me while writing this tutorial; otherwise you may not see the graphs on your screen. If you work in [Jupyter Lab](http://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html), please replace this clause with '%matplotlib ipympl'.
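
Review note on indentation in the hunk above: the token-length statements use the loop variable `author`, so they need to sit inside the `for author in authors:` loop to run once per author. A minimal sketch of that behaviour follows. It uses `collections.Counter` and `str.split` as stand-ins for `nltk.FreqDist` and `nltk.word_tokenize`, and made-up sample texts; all of these are assumptions for illustration, not the lesson's data or code.

```python
from collections import Counter

# Hypothetical sample texts; the lesson loads the Federalist papers instead.
texts_by_author = {
    "Hamilton": "the people of the state",
    "Madison": "we write many short words",
}

length_distributions = {}
for author, text in texts_by_author.items():
    # str.split stands in for nltk.word_tokenize in this sketch.
    tokens = [t for t in text.split() if any(c.isalpha() for c in t)]
    # Indented under the loop: executed once per author,
    # so every author ends up with a length distribution.
    length_distributions[author] = Counter(len(t) for t in tokens)

print(sorted(length_distributions))
```

If the last statement were de-indented, it would run only once, after the loop, and only the final author would get a distribution.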

diff --git a/fr/lecons/introduction-a-la-stylometrie-avec-python.md b/fr/lecons/introduction-a-la-stylometrie-avec-python.md
@@ -58,7 +58,7 @@ Veuillez noter que le code informatique de cette leçon a été conçu pour êtr
 
 ## Lectures préalables
 
-Si vous n'avez pas d'expérience de programmation en Python ou si vous trouvez les exemples dans ce tutoriel difficiles, l'auteur vous recommande de lire les leçons intitulées [Travailler avec des fichiers texte en Python](/fr/lecons/travailler-avec-des-fichiers-texte) et [Manipuler des chaînes de caractères en Python](/fr/lecons/manipuler-chaines-caracteres-python). 
+Si vous n'avez pas d'expérience de programmation en Python ou si vous trouvez les exemples dans ce tutoriel difficiles, l'auteur vous recommande de lire les leçons intitulées [Travailler avec des fichiers texte en Python](/fr/lecons/travailler-avec-des-fichiers-texte) et [Manipuler des chaînes de caractères en Python](/fr/lecons/manipuler-chaines-caracteres-python). Notez aussi que ces leçons ont à l'origine été rédigées en Python 2 tandis que ce tutoriel utilise Python 3. Les différences de [syntaxe](https://fr.wikipedia.org/wiki/Syntaxe) entre les deux versions du langage peuvent être subtiles. En cas de conflit, suivez les exemples tels qu'ils sont codés dans le présent tutoriel et n'utilisez les autres ressources qu'à titre indicatif. (Plus précisément, le code intégré à ce tutoriel a été écrit en [Python 3.6.4](https://www.python.org/downloads/release/python-364/); la chaîne de type [f-string](https://docs.python.org/3/whatsnew/3.6.html#pep-498-formatted-string-literals) qui apparaît dans la ligne `with open(f'data/federalist_{nom_fichier}.txt', 'r') as f:`, par exemple, requiert Python 3.6 ou une version plus récente du langage.)
 
 ## Matériel requis
 
@@ -159,7 +159,7 @@ Ensuite, puisque nous nous intéressons au vocabulaire employé par chaque auteu
 def lire_fichiers_en_chaine(noms_fichiers):
     chaines = []
     for nom_fichier in noms_fichiers:
-        with open(f'data/federalist_{nom_fichier}.txt') as f:
+        with open(f'data/federalist_{nom_fichier}.txt', 'r') as f:
             chaines.append(f.read())
     return '\n'.join(chaines)
 ```
@@ -198,6 +198,7 @@ Le code requis pour calculer les courbes caractéristiques des auteurs du _Féd
 ```python
 # Charger nltk
 import nltk
+nltk.download('punkt')
 %matplotlib inline
 
 # Comparons les articles contestés à ceux écrits par chaque
@@ -215,10 +216,10 @@ for auteur in auteurs:
                                             if any(c.isalpha() for c in occ)])
 
 
-    # Obtenir et dessiner la distribution des fréquences de longueurs
-    occs_longueurs = [len(occ) for occ in federalist_par_auteur_occs[auteur]]
-    federalist_par_auteur_dist_longueurs[auteur] = nltk.FreqDist(occs_longueurs)
-    federalist_par_auteur_dist_longueurs[auteur].plot(15,title=auteur)
+    # Obtenir et dessiner la distribution des fréquences de longueurs
+    occs_longueurs = [len(occ) for occ in federalist_par_auteur_occs[auteur]]
+    federalist_par_auteur_dist_longueurs[auteur] = nltk.FreqDist(occs_longueurs)
+    federalist_par_auteur_dist_longueurs[auteur].plot(15, title=auteur)
 ```
 
 La clause `%matplotlib inline` sous la ligne `import nltk` est nécessaire si vous travaillez dans un environnement de développement [Jupyter Notebook](https://jupyter.org/), comme c'était le cas pour moi lorsque j'ai rédigé ce tutoriel; en son absence, les graphes pourraient ne pas apparaître à l'écran. Si vous travaillez plutôt dans [Jupyter Lab](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html), veuillez remplacer cette clause par `%matplotlib ipympl`.

diff --git a/pt/licoes/introducao-estilometria-python.md b/pt/licoes/introducao-estilometria-python.md
@@ -56,7 +56,7 @@ No final desta lição, teremos percorrido os seguintes tópicos:
 
 ## Leitura prévia
 
-Se você não tem experiência com a linguagem de programação Python ou está tendo dificuldade nos exemplos apresentados neste tutorial, o autor recomenda que você leia as lições [Trabalhando com ficheiros de texto em Python](/pt/licoes/trabalhando-ficheiros-texto-python) e [Manipular Strings com Python](/pt/licoes/manipular-strings-python). Note que essas lições foram escritas em Python versão 2, enquanto esta usa Python versão 3. As diferenças de [sintaxe](https://perma.cc/E5LQ-S65P) entre as duas versões da linguagem podem ser sutis. Se você ficar em dúvida, siga os exemplos conforme descritos nesta lição e use as outras lições como material de apoio. (Este tutorial encontra-se atualizado até à versão [Python 3.8.5](https://perma.cc/XCT2-Q4AT); as [strings literais formatadas](https://perma.cc/U6Q6-59V3) na linha `with open(f'data/pg{filename}.txt', encoding='utf-8') as f:`, por exemplo, requerem Python 3.6 ou uma versão mais recente da linguagem.) 
+Se você não tem experiência com a linguagem de programação Python ou está tendo dificuldade nos exemplos apresentados neste tutorial, o autor recomenda que você leia as lições [Trabalhando com ficheiros de texto em Python](/pt/licoes/trabalhando-ficheiros-texto-python) e [Manipular Strings com Python](/pt/licoes/manipular-strings-python). Note que essas lições foram escritas em Python versão 2, enquanto esta usa Python versão 3. As diferenças de [sintaxe](https://perma.cc/E5LQ-S65P) entre as duas versões da linguagem podem ser sutis. Se você ficar em dúvida, siga os exemplos conforme descritos nesta lição e use as outras lições como material de apoio. (Este tutorial encontra-se atualizado até à versão [Python 3.8.5](https://perma.cc/XCT2-Q4AT); as [strings literais formatadas](https://perma.cc/U6Q6-59V3) na linha `with open(f'dados/pg{id_ficheiro}.txt', 'r', encoding='utf-8') as f:`, por exemplo, requerem Python 3.6 ou uma versão mais recente da linguagem.)
 
 ## Materiais requeridos
 
@@ -146,7 +146,7 @@ def ler_ficheiros_para_string(ids_ficheiros):
     global texto
     strings = []
     for id_ficheiro in ids_ficheiros:
-        with open(f'dados/pg{id_ficheiro}.txt',
+        with open(f'dados/pg{id_ficheiro}.txt', 'r',
 		encoding='utf-8') as f:
             texto = f.read()
             texto = re.search(r"(START.*?\*\*\*)(.*)(\*\*\* END)", 
@@ -191,6 +191,7 @@ O trecho de código necessário para calcular e exibir as curvas característica
 ```python
 # Carregar nltk e matplotlib
 import nltk
+nltk.download('punkt')
 import matplotlib.pylab as plt
 
 obras_tokens = {}
@@ -209,9 +210,9 @@ for autor in autores:
     obras_tokens[autor] = ([token for token in tokens
                                             if any(c.isalpha() for c in token)])
 
-    # Obter a distribuição de comprimentos de tokens
-    token_comprimentos = [len(token) for token in obras_tokens[autor]]
-    obras_distribuicao_comprimento[autor] = nltk.FreqDist(token_comprimentos)
+    # Obter a distribuição de comprimentos de tokens
+    token_comprimentos = [len(token) for token in obras_tokens[autor]]
+    obras_distribuicao_comprimento[autor] = nltk.FreqDist(token_comprimentos)
 
     # Plotar a curva característica de composição
     lista_chaves = []