Translate planstats.xml

gleu · Jun 19, 2017 · ac2533a
1 parent f018012
commit ac2533a
Showing 1 changed file with 44 additions and 41 deletions.
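
Reviewer note: the translated passage turns on the planner multiplying per-column selectivities as if the columns were independent. The sketch below reproduces that arithmetic in Python for the data set built by the `INSERT` in the diff. The variable names are illustrative assumptions. This is not PostgreSQL's estimator code, only the calculation the documentation describes.

```python
# Rebuild the documented data set: t(a, b) with a = b = i % 100 for i in 1..10000.
rows = [(i % 100, i % 100) for i in range(1, 10001)]

total_rows = len(rows)    # 10000, matching reltuples in the diff
distinct_values = 100     # each column holds 100 uniformly distributed values

# Per-column selectivity of "a = 1" (and likewise "b = 1"): 1%.
sel_a = 1 / distinct_values
sel_b = 1 / distinct_values

# Independence assumption: multiply the selectivities (0.01% overall).
independent_estimate = round(total_rows * sel_a * sel_b)

# What the table actually contains for "a = 1 AND b = 1".
actual_rows = sum(1 for a, b in rows if a == 1 and b == 1)

print(independent_estimate, actual_rows)
```

The estimate and the actual count differ by two orders of magnitude, which is the gap the translated paragraph has to convey.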
diff --git a/postgresql/planstats.xml b/postgresql/planstats.xml
@@ -451,29 +451,31 @@ rows = (outer_cardinality * inner_cardinality) * selectivity
  </sect1>
 
  <sect1 id="multivariate-statistics-examples">
-  <title>Multivariate Statistics Examples</title>
+  <title>Exemples de statistiques multivariées</title>
 
   <indexterm>
-   <primary>row estimation</primary>
-   <secondary>multivariate</secondary>
+   <primary>estimation de ligne</primary>
+   <secondary>multivariée</secondary>
   </indexterm>
 
   <sect2>
-   <title>Functional Dependencies</title>
+   <title>Dépendances fonctionnelles</title>
 
    <para>
-    Multivariate correlation can be demonstrated with a very simple data set
-    &mdash; a table with two columns, both containing the same values:
+    La corrélation multivariée peut être démontrée avec un jeu de test très
+    simple &mdash; une table avec deux colonnes, chacune contenant les mêmes
+    valeurs :
 
 <programlisting>
 CREATE TABLE t (a INT, b INT);
 INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i);
 ANALYZE t;
 </programlisting>
 
-    As explained in <xref linkend="planner-stats"/>, the planner can determine
-    cardinality of <structname>t</structname> using the number of pages and
-    rows obtained from <structname>pg_class</structname>:
+    Comme expliqué dans <xref linkend="planner-stats"/>, l'optimiseur peut
+    déterminer la cardinalité de <structname>t</structname> en utilisant le
+    nombre de pages et de lignes obtenues dans
+    <structname>pg_class</structname> :
 
 <programlisting>
 SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
@@ -483,13 +485,13 @@ SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
        45 |     10000
 </programlisting>
 
-    The data distribution is very simple; there are only 100 distinct values
-    in each column, uniformly distributed.
+    La distribution des données est très simple ; il n'y a que 100 valeurs
+    différentes dans chaque colonne, distribuées de manière uniforme.
    </para>
 
    <para>
-    The following example shows the result of estimating a <literal>WHERE</literal>
-    condition on the <structfield>a</structfield> column:
+    L'exemple suivant montre le résultat de l'estimation d'une condition
+    <literal>WHERE</literal> sur la colonne <structfield>a</structfield> :
 
 <programlisting>
 EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
@@ -500,13 +502,14 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
    Rows Removed by Filter: 9900
 </programlisting>
 
-    The planner examines the condition and determines the selectivity
-    of this clause to be 1%.  By comparing this estimate and the actual
-    number of rows, we see that the estimate is very accurate
-    (in fact exact, as the table is very small).  Changing the
-    <literal>WHERE</literal> to use the <structfield>b</structfield> column, an identical
-    plan is generated.  But observe what happens if we apply the same
-    condition on both columns, combining them with <literal>AND</literal>:
+    L'optimiseur examine la condition et détermine que la sélectivité de cette
+    clause est de 1%.  En comparant cette estimation avec le nombre réel de
+    lignes, on voit que l'estimation est très précise (elle est en fait exacte,
+    car la table est très petite).  En changeant la clause
+    <literal>WHERE</literal> pour utiliser la colonne
+    <structfield>b</structfield>, un plan identique est généré.  Mais observons
+    ce qui arrive si nous appliquons la même condition aux deux
+    colonnes, en les combinant avec <literal>AND</literal> :
 
 <programlisting>
 EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
@@ -517,18 +520,19 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
    Rows Removed by Filter: 9900
 </programlisting>
 
-    The planner estimates the selectivity for each condition individually,
-    arriving at the same 1% estimates as above.  Then it assumes that the
-    conditions are independent, and so it multiplies their selectivities,
-    producing a final selectivity estimate of just 0.01%.
-    This is a significant underestimate, as the actual number of rows
-    matching the conditions (100) is two orders of magnitude higher.
+    L'optimiseur estime la sélectivité pour chaque condition individuellement,
+    en arrivant à la même estimation de 1% que ci-dessus.  Puis il part du
+    principe que les conditions sont indépendantes, et multiplie donc leurs
+    sélectivités, produisant une estimation de sélectivité finale de seulement
+    0,01%.  C'est une sous-estimation importante, puisque le nombre réel de
+    lignes correspondant aux conditions (100) est supérieur de deux ordres de
+    grandeur.
    </para>
 
    <para>
-    This problem can be fixed by creating a statistics object that
-    directs <command>ANALYZE</command> to calculate functional-dependency
-    multivariate statistics on the two columns:
+    Ce problème peut être corrigé en créant un objet statistiques qui demandera
+    à <command>ANALYZE</command> de calculer des statistiques multivariées de
+    dépendances fonctionnelles sur les deux colonnes :
 
 <programlisting>
 CREATE STATISTICS stts (dependencies) ON a, b FROM t;
@@ -544,15 +548,14 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
   </sect2>
 
   <sect2>
-   <title>Multivariate N-Distinct Counts</title>
+   <title>Nombres n-distinct multivariés</title>
 
    <para>
-    A similar problem occurs with estimation of the cardinality of sets of
-    multiple columns, such as the number of groups that would be generated by
-    a <command>GROUP BY</command> clause.  When <command>GROUP BY</command>
-    lists a single column, the n-distinct estimate (which is visible as the
-    estimated number of rows returned by the HashAggregate node) is very
-    accurate:
+    Un problème similaire apparaît avec l'estimation de la cardinalité d'un
+    ensemble de plusieurs colonnes, tel que le nombre de groupes qu'une clause
+    <command>GROUP BY</command> générerait.  Quand <command>GROUP BY</command>
+    liste une seule colonne, l'estimation n-distinct (qui est visible comme le
+    nombre de lignes estimé par le nœud HashAggregate) est très précise :
 <programlisting>
 EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
                                        QUERY PLAN                                        
@@ -561,9 +564,9 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
    Group Key: a
    -&gt;  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1)
 </programlisting>
-    But without multivariate statistics, the estimate for the number of
-    groups in a query with two columns in <command>GROUP BY</command>, as
-    in the following example, is off by an order of magnitude:
+    Mais sans statistiques multivariées, l'estimation du nombre de groupes dans
+    une requête ayant deux colonnes dans le <command>GROUP BY</command>, comme
+    dans l'exemple suivant, est fausse d'un ordre de grandeur :
 <programlisting>
 EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                        QUERY PLAN                                        
@@ -572,8 +575,8 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
    Group Key: a, b
    -&gt;  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
 </programlisting>
-    By redefining the statistics object to include n-distinct counts for the
-    two columns, the estimate is much improved:
+    En redéfinissant l'objet statistiques pour inclure un nombre n-distinct
+    pour les deux colonnes, l'estimation est bien améliorée :
 <programlisting>
 DROP STATISTICS stts;
 CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;