
Background theory

Introduction

The EM1DTM and EM1DTMFWD programs are designed to interpret time-domain, small-loop electromagnetic data over a 1D layered Earth. These programs model the Earth's time-domain electromagnetic response to a small inductive loop source carrying a time-varying current. The data are the secondary magnetic field that results from currents and magnetization induced in the Earth.

Details regarding the source and receiver

EM1DTM and EM1DTMFWD assume that the transmitter loop is horizontal. The programs also assume that each receiver loop is sufficiently small that it may be treated as a point receiver; i.e., the spatial variation in magnetic flux through the receiver loop is negligible.

Details regarding the domain

Layered 1D model describing the Earth for each sounding.

EM1DTM and EM1DTMFWD model the Earth's response for measurements above a stack of uniform, horizontal layers. The coordinate system used for describing the Earth models has z as positive downwards, with the surface of the Earth at z=0. The z-coordinates of the source and receiver locations (which must be above the surface) are therefore negative. Each layer is defined by a thickness and an electrical conductivity. It is the physical properties of the layers that are obtained during the inversion of all measurements gathered at a single location, with the depths to the layer interfaces remaining fixed. If measurements made at multiple locations are being interpreted, the corresponding one-dimensional models are juxtaposed to create a two-dimensional image of the subsurface.

All measurements that are to be inverted for a single one- dimensional model must be grouped together as a “sounding”. A sounding can be considered a distinct collection of TEM measurements (receivers and times) for a given transmitter. Each different sounding is inverted for a separate one-dimensional model.

Two distinct groupings of transmitters and receivers (soundings) at the same location (left). Different soundings used to map lateral variation in the Earth (right).

The electrical conductivity of Earth materials varies over several orders of magnitude, making it more natural to invert for the logarithms of the layer conductivities rather than the conductivities themselves. This also ensures that conductivities in the constructed model are positive.
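As a minimal illustration of this parameterization (not code from EM1DTM), the inversion works with m = log σ, and any update to m maps back to a strictly positive conductivity:

```python
import numpy as np

# Layer conductivities spanning several orders of magnitude (S/m)
sigma = np.array([1e-4, 5e-2, 1.0])

# Invert for m = log(sigma); equal steps in m correspond to equal
# multiplicative changes in conductivity
m = np.log(sigma)

# An arbitrary model update in log space...
m_updated = m + np.array([0.5, -1.0, 0.2])

# ...always maps back to positive conductivities
sigma_updated = np.exp(m_updated)
print(np.all(sigma_updated > 0))  # True
```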

Details regarding the data

For a horizontal loop source, EM1DTM and EM1DTMFWD can handle any combination of:

  • times for the transmitter current
  • separations and heights for the transmitter and receiver(s)
  • B-field or time-varying (dB/dt) response
  • x, y and z-components of the response

Programs EM1DTM and EM1DTMFWD accept observations in two different forms: values of the secondary magnetic H-field in A/m and values of the total H-field in A/m. If the transmitter and receiver have the same orientation, the x-, y- and z-components of the secondary field are normalized by the x-, y- and z-components of the free-space field, respectively. If the transmitter and receiver have different orientations, the secondary field is normalized by the magnitude of the free-space field.
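A sketch of this normalization convention (a hypothetical helper for illustration, not a routine from the programs):

```python
import numpy as np

def normalize_secondary(Hs, H0, same_orientation):
    """Normalize the secondary-field components Hs = (Hx, Hy, Hz) by the
    free-space field H0, following the convention described above.
    Hypothetical helper for illustration only."""
    Hs = np.asarray(Hs, dtype=float)
    H0 = np.asarray(H0, dtype=float)
    if same_orientation:
        # component-by-component normalization
        return Hs / H0
    # normalize every component by the magnitude of the free-space field
    return Hs / np.linalg.norm(H0)

# Same orientation: each component is scaled by its free-space counterpart
print(normalize_secondary([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], True))   # [0.5 0.5 0.5]

# Different orientations: every component is scaled by |H0| = 5
print(normalize_secondary([5.0, 10.0, 0.0], [3.0, 0.0, 4.0], False))  # [1. 2. 0.]
```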

Forward Modeling

The method used to compute the magnetic field values for a particular source-receiver arrangement over a layered Earth model is the matrix propagation approach described in Farquharson & Oldenburg (1996) and Farquharson, Oldenburg & Routh (2003). The method uses the z-component of the Schelkunoff F-potential (Ward1987). Computations are done in the frequency domain, and the fields are then transformed to the time domain.

This section covers the forward-modelling procedure for the three components of the magnetic field above a horizontally-layered Earth model, for a horizontal, many-sided transmitter loop that is also above the surface of the model. The field for the transmitter loop is computed by superposition of the fields due to horizontal electric dipoles (see equations 4.134-4.152 of Ward1987). Because the loop is closed, the contributions from the ends of each electric dipole cancel, and the superposition is carried out only over the TE-mode component. This TE mode involves only the z-component of the Schelkunoff F-potential, just as for program EM1DFM.

The Schelkunoff F-potential is defined as follows:

$$\begin{aligned} \mathbf{E} = - \nabla \times \mathbf{F} \\\ \mathbf{H} = - \sigma \mathbf{F} + \frac{1}{i \omega \mu }\nabla (\nabla \cdot \mathbf{F}) \end{aligned}$$

where E and H are the electric and magnetic fields, σ and μ are the conductivity and permeability of the uniform region to which the above equations refer, and a time-dependence of eiωt has been assumed.

In the jth layer (j > 0), with conductivity σj and permeability μj, the z-component of the Schelkunoff potential satisfies the following equation (assuming the quasi-static approximation, and that the permeability of the layer is equal to that of free space, μ0):


$$\nabla^2 F_j \;-\; i\omega\mu_0\sigma_j F_j \;=\; 0$$

The propagation of F through the stack of layers therefore happens in exactly the same way as in the frequency-domain case, and so is not repeated here (see the EM1DFM Forward modeling section: http://em1dfm.readthedocs.io/en/latest/content/theory.html#forward-modeling).

Note

Assumptions:

  • eiωt time-dependence (just as in Ward1987)
  • quasi-static approximation throughout
  • z positive downwards
  • air halfspace (σ = 0) for z < 0; piecewise-constant model (σ > 0) of N layers for z ≥ 0, with the Nth layer being the basement (i.e. homogeneous) halfspace
  • magnetic permeability everywhere equal to that of free space

The propagation of F through the layers gives the following expression for the kernel of the Hankel transform, F̃0, in the air halfspace (z < 0):

$$\tilde{F}_0\;=D_0^S\Big(e^{-u_0z}\;+\;{P_{21}\over P_{11}}e^{u_0z}\Big)$$

(same as here in EM1DFM).

For a horizontal x-directed electric dipole at a height h (i.e., z =  − h, h > 0) above the surface of the layered Earth, the downward-decaying part of the primary solution (the only downward-decaying part of the solution in the air halfspace) at the surface of the Earth (z = 0) is given by

$$D_0^S\;=\;-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\>e^{-u_0h}$$

from Ward1987, equation (4.137). Substituting A-2 into A-1 gives

$$\tilde{F}_0\;=-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0(z+h)}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big).$$

Generalizing this expression for z above (z <  − h) as well as below the source (z >  − h):

$$\tilde{F}_0\;=-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big).$$

Applying the inverse Fourier transform to A-4 gives

$$F_0(x,y,z,\omega)\;=\;-\,{1\over4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y$$

(equation (4.139) of Ward1987). Using the identity


$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tilde{F}(k_x^2+k_y^2)\,e^{i(k_xx+k_yy)}\;dk_x\,dk_y \;=\; 2\pi\int_0^{\infty}\tilde{F}(\lambda)\,\lambda\,J_0(\lambda r)\;d\lambda,$$

(Ward1987, equation (2.10)) where λ2 = kx2 + ky2 and r2 = x2 + y2, A-5 can be rewritten as

$$F_0(x,y,z,\omega)\;=\;-\,{1\over2\pi}\,{\partial\over\partial y}\,\int_0^{\infty} {i\omega\mu_0\over 2u_0}\>{1\over\lambda^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, \lambda\,J_0(\lambda r)\,d\lambda,$$

$$=\;-\,{i\omega\mu_0\over4\pi}\,{\partial\over\partial y}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda^2}\,J_0(\lambda r)\,d\lambda,$$

$$=\;{i\omega\mu_0\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda}\,J_1(\lambda r)\,d\lambda$$

since

$${\partial J_0(\lambda r)\over\partial y}\;=\;-\,\lambda{y\over r}\,J_1(\lambda r)$$

(Ward1987, equation 4.44 (almost)).

The H-field in the air halfspace can be obtained from A-9 (or A-8) by using equation (1.130) of Ward1987:

$$H_x\;=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial x\partial z},$$

$$H_y\;=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial y\partial z},$$

$$H_z\;=\;{1\over i\omega\mu_0}\,\Big({\partial^2\over\partial z^2}\,+\,\kappa_0^2\Big) \,F_0$$

$$=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial z^2}.$$

since κ02 = 0. Applying A-11 to A-9 gives

$$H_x(x,y,z,\omega)\;=\;{1\over4\pi}\,{\partial\over\partial x}\,{y\over r}\,\int_0^{\infty} \,\Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_1(\lambda r)\,d\lambda.$$

(The plus/minus depends on whether the observation location is above or below the source. In the program, it is perhaps only the secondary fields that are computed using the above expressions: the primary field, which corresponds to the first term in each Hankel transform kernel above, is computed using its form in (x, y, z)-space.) When the above expression for a horizontal electric dipole is integrated along a wire, all that remains is the effect of the endpoints. These cancel when integrating around the closed loop. So, as far as the part of Hx that contributes to the field due to a closed loop is concerned:


$$H_x(x,y,z,\omega)\;=\;0.$$

For the y-component of the H-field, first consider differentiating the expression for F0 in A-5 with respect to y:

$${\partial F_0\over\partial y}\;=\;-\,{1\over4\pi^2}\,{\partial\over\partial y}\, \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

$$=\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{k_y^2\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

$$=\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y$$

$$\qquad-\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{k_x^2\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

since

$${k_y^2\over k_x^2+k_y^2}\;=\;1\>-\>{k_x^2\over k_x^2+k_y^2}.$$

Converting the kx2 into derivatives with respect to x, and converting the two-dimensional Fourier transforms to Hankel transforms gives

$${\partial F_0\over\partial y}\;=\;{i\omega\mu_0\over4\pi}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad+\;{i\omega\mu_0\over4\pi}\,{\partial^2\over\partial x^2}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda^2}\>J_0(\lambda r)\;d\lambda,$$

$$=\;{i\omega\mu_0\over4\pi}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad-\;{i\omega\mu_0\over4\pi}\,{\partial\over\partial x}\,{x\over r}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda}\>J_1(\lambda r)\;d\lambda,$$

using equations (4.144) and (4.117) of Ward1987. Differentiating A-22 with respect to z and scaling by iωμ0 (see A-12) gives

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad-\;{1\over4\pi}\,{\partial\over\partial x}\,{x\over r}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_1(\lambda r)\;d\lambda$$

(equation 4.150 of Ward1987). The second integral in the above expression only contributes at the ends of the dipole. So the only part of Hy required to compute the field due to the closed loop is

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_0(\lambda r)\;d\lambda.$$

Finally, applying A-14 to A-9 gives the z-component of the H-field:

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_1(\lambda r)\,d\lambda$$

(equation (4.152) of Ward1987).

A-24 and A-25 are for the total H-field (Hx = 0 from A-16 for an x-directed electric dipole, excluding the effects at the end-points), that is, the wholespace field up in the air plus the field due to currents induced in the layered Earth. In A-24 and A-25, the first part of the kernel of the Hankel transform corresponds to the primary wholespace field:

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \pm\,e^{-\lambda|z+h|}\, \lambda\,J_0(\lambda r)\;d\lambda,$$

$$=\;{1\over4\pi}\,{\partial\over\partial z}\,\int_0^{\infty} e^{-\lambda|z+h|}\, J_0(\lambda r)\;d\lambda,$$

and

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} e^{-\lambda|z+h|}\, \lambda\,J_1(\lambda r)\,d\lambda$$

$$=\;-\,{1\over4\pi}\,{y\over r}\,{\partial\over\partial r}\,\int_0^{\infty} e^{-\lambda|z+h|}\, J_0(\lambda r)\,d\lambda.$$

From Ward1987 equation (4.53), the integral in the above two expressions can be done:

$$\int_0^{\infty}e^{-\lambda|z+h|}\,J_0(\lambda r)\,d\lambda\;=\; {1\over\big(r^2+(z+h)^2\big)^{1/2}}.$$

So

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,{\partial\over\partial z}\, {1\over\big(r^2+(z+h)^2\big)^{1/2}},$$

$$=\;-\,{1\over4\pi}\, {z+h\over\big(r^2+(z+h)^2\big)^{3/2}}$$

(equation (2.42) of Ward1987 for σ = 0), and

$$H_z(x,y,z,\omega)\;=\;-\,{1\over4\pi}\,{y\over r}\,{\partial\over\partial r}\, {1\over\big(r^2+(z+h)^2\big)^{1/2}},$$

$$=\;{1\over4\pi}\,{y\over r}\, {r\over\big(r^2+(z+h)^2\big)^{3/2}},$$

$$=\;{1\over4\pi}\, {y\over\big(r^2+(z+h)^2\big)^{3/2}}$$

(equation (2.42) of Ward1987 for σ = 0).
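The closed-form integral used above (Ward1987, equation (4.53)) can be verified numerically; the values of r and |z + h| below are arbitrary test values:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

r, zph = 2.0, 3.0   # r and |z + h|, arbitrary positive test values

# Left-hand side: integrate exp(-lambda*|z+h|) * J0(lambda*r) over lambda
lhs, _ = quad(lambda lam: np.exp(-lam * zph) * j0(lam * r), 0.0, np.inf)

# Right-hand side: 1 / sqrt(r^2 + (z+h)^2)
rhs = 1.0 / np.sqrt(r**2 + zph**2)

print(lhs, rhs)  # both approximately 0.27735
```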

Frequency- to time-domain transformation -- Part I

The solution for the H-field in the frequency domain for a delta-function source in time (and hence a flat, constant, real source term in the frequency domain) is, for example,

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_1(\lambda r)\,d\lambda.$$

Doing the inverse Fourier transform of these kinds of expressions does not encounter any subtleties, and gives an H-field as a function of time that, schematically, looks like:

S(t) = δ(t)  ⟶  Gh(t)

This is the basic response that program EM1DTM computes. The notation Gh(t) is used because this is the Green's function whose convolution with the transmitter current waveform S(t) gives the H-field:


$$h(t)\;=\;\int_{t^{\prime}=-\infty}^{t} G^h(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}.$$

The H-field for the delta-function source, that is, Gh, certainly exists for t > 0. Also, it is certainly zero for t < 0. And at t = 0 it certainly is not infinite (that would not be physical). Let's re-describe the function Gh (shown in the diagram above) as


$$G^h(t)\;=\;X(t)\,\tilde{G}^h(t),$$

where G̃h(t) is equal to Gh for t > 0, G̃h(0) = limt → 0+Gh, and does anything it wants for t < 0; X(t) is the Heaviside step function. This moves all issues about what is happening at t = 0 into the Heaviside function.

For measurements of voltage, the Green's function (impulse response) that is required is the time derivative of Gh (and for all t, not just t > 0). Schematically:

S(t) = δ(t)  ⟶  GV(t)

In terms of math:


$$V(t)\;=\;\int_{t^{\prime}=-\infty}^{t} G^V(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}.$$

Let's take the time derivative of AA-2 to get the full expression for GV:

$$\begin{aligned} G^V(t)\;&=\;{dG^h\over dt}, \\\ &=\;{d\over dt}\big(X\,\tilde{G}^h\big)\\\ &=\;X\,{d\tilde{G}^h\over dt}\;+\;\delta\,\tilde{G}^h, \end{aligned}$$

where δ is the delta function. This is not a time derivative that should be carried out numerically. So, given the basic Gh(t) and some representation of the transmitter current waveform S(t), program EM1DTM currently uses the re-arrangement of AA-3 given by substituting AA-4 into AA-3 and then integrating by parts:

$$\begin{aligned} V(t)\;&=\;\int_{t^{\prime}=-\infty}^{\infty} \Big\{X(t-t^{\prime})\,{d\tilde{G}^h\over dt^{\prime}}(t-t^{\prime})\;+\; \delta(t-t^{\prime})\,\tilde{G}^h(t-t^{\prime})\Big\} \,S(t^{\prime})\>dt^{\prime},\\\ &=\;\tilde{G}^h(0)\,S(t)\;+\; \int_{t^{\prime}=-\infty}^t{d\tilde{G}^h\over dt^{\prime}}(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}, \end{aligned}$$

where the Heaviside function has been used to restrict the limits of the integration. Now doing the integration by parts:

$$\begin{aligned} V(t)\;&=\;\tilde{G}^h(0)\,S(t)\;+\; \Big[\tilde{G}^h(t-t^{\prime})\,S(t^{\prime})\Big]_{t^{\prime}=-\infty}^t\;-\; \int_{t^{\prime}=-\infty}^t\tilde{G}^h(t-t^{\prime})\,{dS\over dt^{\prime}}(t^{\prime})\>dt^{\prime} \\\ &=\;\tilde{G}^h(0)\,S(t)\;+\; \tilde{G}^h(0)\,S(t)\;-\; \int_{t^{\prime}=-\infty}^t\tilde{G}^h(t-t^{\prime})\,{dS\over dt^{\prime}}(t^{\prime})\>dt^{\prime}. \end{aligned}$$

Which looks as though it has the expected additional non-convolution-integral term.

However, perhaps there should be an additional minus sign in going from AA-4 to the expression before AA-5, because the derivative has changed from d/dt to d/dt′. But perhaps not.

Frequency- to time-domain transformation

The Fourier transform that was applied to Maxwell's equations to get the frequency-domain equations was (see Ward1987, equation (1.1))


$$F(\omega)\;=\;\int_{-\infty}^{\infty}f(t)\>e^{-i\omega t}dt,$$

and the corresponding inverse transform is

$$f(t)\;=\;{1\over2\pi}\int_{-\infty}^{\infty}F(\omega)\>e^{i\omega t}d\omega.$$

For the frequency domain computations, it is assumed that the source term is the same for all frequencies. In other words, a flat spectrum, which corresponds to a delta-function time-dependence of the source.

Consider at the moment a causal signal, that is, one for which f(t) = 0 for t < 0. The Fourier transform of this signal is then

$$\begin{aligned} F(\omega)\;&=\;\int_0^{\infty}f(t)\>e^{-i\omega t}dt \\\ &=\;\int_0^{\infty}f(t)\>\cos\,\omega t\>dt\;-\;i\,\int_0^{\infty}f(t)\>\sin\,\omega t\>dt. \end{aligned}$$

Note that because of the dependence of the real part of F(ω) on cos  ωt and of the imaginary part on sin  ωt, the real part of F(ω) is even and the imaginary part of F(ω) is odd. Hence, f(t) can be obtained from either the real or imaginary part of its Fourier transform via the inverse cosine or sine transform:

$$\begin{aligned} f(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Re}\,F(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ f(t)\;&=\;-\,{2\over\pi}\int_0^{\infty} {\rm Im}\,F(\omega)\>\sin\,\omega t\>d\omega \end{aligned}$$

(For factor of  2/π see, for example, Arfken.)
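These transform pairs can be sanity-checked with a simple causal response: take f(t) = (1/τ)e−t/τ, whose spectrum is F(ω) = 1/(1 + iωτ) (an illustrative toy spectrum, not an EM response). The inverse cosine transform of Re F recovers f(t):

```python
import numpy as np
from scipy.integrate import quad

tau, t = 1.0, 0.5

# Re F(omega) for the toy spectrum F(omega) = 1/(1 + i*omega*tau)
re_F = lambda w: 1.0 / (1.0 + (w * tau)**2)

# f(t) = (2/pi) * integral_0^inf Re F(omega) cos(omega t) d omega,
# evaluated with scipy's oscillatory-weight (QAWF) quadrature
val, _ = quad(re_F, 0.0, np.inf, weight='cos', wvar=t)
f_t = (2.0 / np.pi) * val

print(f_t, np.exp(-t / tau) / tau)  # both approximately 0.6065
```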

Now consider that we have computed the H-field in the frequency domain for a uniform source spectrum. Then, from the above expressions, the time-domain H-field for a positive delta-function source time-dependence is

$$\begin{aligned} h_{\delta+}(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\delta+}(t)\;&=\;-\,{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega \end{aligned}$$

where H(ω) is the frequency-domain H-field for the uniform source spectrum. For a negative delta-function source:

$$\begin{aligned} h_{\delta-}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\delta-}(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega. \end{aligned}$$

The negative delta-function source dependence is the derivative with respect to time of a step turn-off source dependence. Hence, the derivative of the time-domain H-field due to a step turn-off is also given by the above expressions:

$$\begin{aligned} {\partial h_{\rm s}\over\partial t}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ {\partial h_{\rm s}\over\partial t}(t)\;&=\; {2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega. \end{aligned}$$

Integrating the above two expressions gives the H-field for a step turn-off source:

$$\begin{aligned} h_{\rm s}(t)\;&=\;h(0)\> -\>{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>{1\over\omega}\,\sin\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\rm s}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>{1\over\omega}\,\cos\,\omega t\>d\omega. \end{aligned}$$

(See also Newman, Hohmann and Anderson, and Kaufman and Keller for all this.)
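The step turn-off formula can be checked with the same toy spectrum F(ω) = 1/(1 + iωτ) used above (an illustration, not an EM response): the impulse response is (1/τ)e−t/τ, so the step turn-off response should be its tail integral, e−t/τ:

```python
import numpy as np
from scipy.integrate import quad

tau, t = 1.0, 0.5

# Im F(omega)/omega for F(omega) = 1/(1 + i*omega*tau), simplified
# analytically so the integrand is regular at omega = 0
im_F_over_w = lambda w: -tau / (1.0 + (w * tau)**2)

# h_s(t) = -(2/pi) * integral_0^inf Im F(omega) (1/omega) cos(omega t) d omega
val, _ = quad(im_F_over_w, 0.0, np.inf, weight='cos', wvar=t)
h_s = -(2.0 / np.pi) * val

# For an impulse response (1/tau) e^(-t/tau), the step turn-off
# response is the tail integral e^(-t/tau)
print(h_s, np.exp(-t / tau))  # both approximately 0.6065
```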

Thinking in terms of the time-domain inhomogeneous differential equation:

$$\begin{aligned} L\,h_{\delta-}\;&=\;\delta_- \\\ \Rightarrow\quad L\,h_{\delta-}\;&=\;{\partial\over\partial t}H_{\rm o} \\\ \Rightarrow\quad L\,{\partial h_s\over\partial t}\;&=\;{\partial\over\partial t}H_{\rm o}. \end{aligned}$$

Diagram: fake/equivalent world vs. real world, showing h(t) and ∂h/∂t(t).

Top left is what we know (flat frequency spectrum for the source and sine transform of the imaginary part of the field), and top right is what we're after. Bottom left is obtained from top left by convolution with the box-car, and bottom right is what we're considering it to be. Note that there should really be some minus signs in the above diagram.

Diagram: fake/equivalent world vs. real world, showing ∫t h(t′) dt′ and h(t).

Again, top left is what we know (flat frequency spectrum for the source and sine transform of the imaginary part of the field), and top right is what we're after. Bottom left is obtained from top left by convolution with the box-car, and bottom right is what we're considering it to be. Note that there should really be some minus signs in the above diagram.

Diagram: fake/equivalent world vs. real world, showing h(t) and ∂h/∂t(t).

Top left is what we have, and top right is what we're thinking it is. Bottom left is the convolution with a discretized half-sine, and bottom right is what we're considering it to be: the time-derivative of the H-field for a half-sine waveform.

Integration of cubic splined function

The time-domain voltage or magnetic field ends up being known at a number of discrete, logarithmically (exponentially) spaced times as a result of Anderson's cosine/sine digital transform. This time-domain function is cubic splined in terms of the logarithms of the times. Hence, between any two discrete times, the time-domain function is approximated by the cubic spline


$$y(h)\;=\;y_0\,+\,q_1h\,+\,q_2h^2\,+\,q_3h^3,$$

(see routines RSPLN and RSPLE) where h = log x − log ti, x is the time at which the function y is required, ti is the ith time at which y is known (ti ≤ x ≤ ti + 1), y0 = y(log ti), and q1, q2 and q3 are the spline coefficients. The required integral is

$$\begin{aligned} \int_{x=a}^b y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(h+\log t_i)}\,dh \\\ &=\;t_i\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^h\,dh. \end{aligned}$$

Substituting the polynomial expression for y(h) into the above integral and worrying about each term individually gives:


$$\int y_0\,e^h\>dh\;=\;y_0\,e^h,$$


$$\int q_1h\,e^h\>dh\;=\;q_1\,e^h(h-1)$$

(G and R 2.322.1),


$$\int q_2h^2\,e^h\>dh\;=\;q_2\,e^h(h^2-2h+2)$$

(G and R 2.322.2), and


$$\int q_3h^3\,e^h\>dh\;=\;q_3\,e^h(h^3-3h^2+6h-6)$$

(G and R 2.322.3). Hence, summing the integrals above,

$$\begin{aligned} \int_{x=a}^b y(\log x)\>dx\;=&\;t_i\,y_0\Big({b\over t_i}\,-\,{a\over t_i}\Big)\\\ &+\;t_i\,q_1\Big({b\over t_i}(\log b-\log t_i-1)\>-\>{a\over t_i}(\log a-\log t_i-1)\Big)\\\ &+\;t_i\,q_2\Big({b\over t_i}\big((\log b-\log t_i)^2-2(\log b-\log t_i)+2\big)\>-\\\ &\quad\qquad\qquad{a\over t_i}\big((\log a-\log t_i)^2-2(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i\,q_3\Big({b\over t_i}\big((\log b-\log t_i)^3-3(\log b-\log t_i)^2+6(\log b-\log t_i)-6\big)\>-\\\ &\quad\qquad\qquad{a\over t_i}\big((\log a-\log t_i)^3-3(\log a-\log t_i)^2+6(\log a-\log t_i)-6\big)\Big). \end{aligned}$$
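The antiderivatives above combine to give this closed-form integral, which can be verified against direct numerical quadrature (the spline coefficients and limits below are arbitrary test values):

```python
import numpy as np
from scipy.integrate import quad

y0, q1, q2, q3 = 0.7, -0.3, 0.2, 0.05     # arbitrary spline coefficients
ti, a, b = 1e-3, 1.2e-3, 2.5e-3           # spline knot time and integration limits

def y(h):
    # cubic spline segment, h = log x - log t_i
    return y0 + q1*h + q2*h**2 + q3*h**3

def antideriv(h):
    # sum of the four antiderivatives of y(h) e^h listed above
    return np.exp(h) * (y0
                        + q1 * (h - 1.0)
                        + q2 * (h**2 - 2.0*h + 2.0)
                        + q3 * (h**3 - 3.0*h**2 + 6.0*h - 6.0))

ha, hb = np.log(a / ti), np.log(b / ti)

closed = ti * (antideriv(hb) - antideriv(ha))
numeric, _ = quad(lambda x: y(np.log(x / ti)), a, b)

print(closed, numeric)  # agree to quadrature accuracy
```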

The original plan was to treat a discretized transmitter current waveform as a piecewise linear function (i.e., straight line segments between the provided sample points), which meant that the response coming out of Anderson's filtering routine was convolved with the piecewise-constant time-derivative of the transmitter current waveform to give voltages. This proved to be not good enough for on-time calculations (the step-like nature of the approximation of the time derivative of the transmitter current waveform could be seen in the computed voltages). So it was decided to cubic spline the transmitter current waveform, which gives a piecewise quadratic approximation to the time derivative of the waveform. The convolution of the output of Anderson's routine is therefore now with a constant, a linear time term and a quadratic term. The integral above is still required, along with:

$$\begin{aligned} \int_{x=a}^b x\,y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x^2\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{2\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(2h+2\log t_i)}\,dh \\\ &=\;t_i^2\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{2h}\,dh \end{aligned}$$

Using the integrals above for the various powers of h times eh, the relevant integrals for the various parts of the cubic spline representation of y(h) are:

$$\int y_0\,e^{2h}\>dh\;=\;y_0\,{1\over2}\,e^{2h},$$

$$\int q_1 h\,e^{2h}\>dh\;=\;q_1 {1\over4} e^{2h}(2h-1),$$

$$\int q_2 h^2 e^{2h}\>dh\;=\;q_2 {1\over8} e^{2h}(4h^2-4h+2),$$

$$\int q_3 h^3 e^{2h}\>dh\;=\;q_3 {1\over16} e^{2h}(8h^3-12h^2+12h-6).$$

The limits for the integral are h = log a − log ti and h = log b − log ti. The term e2h becomes:

$$\begin{aligned} e^{2(\log X-\log t_i)}\;&=\;\big\{e^{(\log X-\log t_i)}\big\}^2 \\\ &=\;\bigg\{{e^{\log X}\over e^{\log t_i}}\bigg\}^2 \\\ &=\;\bigg({X\over t_i}\bigg)^2 \\\ &=\;{X^2\over t_i^2} \end{aligned}$$

where X is either a or b. Hence,

$$\begin{aligned} \int_{x=a}^b x\,y(\log x)\>dx\;=&\;t_i^2\,y_0\,{1\over 2}\Big({b^2\over t_i^2}\,-\,{a^2\over t_i^2}\Big)\\\ &+\;t_i^2\,q_1\,{1\over 4}\Big({b^2\over t_i^2}(2\log b-2\log t_i-1)\>-\>{a^2\over t_i^2}(2\log a-2\log t_i-1)\Big)\\\ &+\;t_i^2\,q_2\,{1\over 8}\Big({b^2\over t_i^2}\big(4(\log b-\log t_i)^2-4(\log b-\log t_i)+2\big)\>-\\\ &\qquad\qquad\qquad{a^2\over t_i^2}\big(4(\log a-\log t_i)^2-4(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i^2\,q_3\,{1\over 16}\Big({b^2\over t_i^2}\big(8(\log b-\log t_i)^3-12(\log b-\log t_i)^2+12(\log b-\log t_i)-6\big)\>-\\\ &\qquad\qquad\qquad{a^2\over t_i^2}\big(8(\log a-\log t_i)^3-12(\log a-\log t_i)^2+12(\log a-\log t_i)-6\big)\Big). \end{aligned}$$

And

$$\begin{aligned} \int_{x=a}^b x^2\,y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x^3\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{3\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(3h+3\log t_i)}\,dh \\\ &=\;t_i^3\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{3h}\,dh. \end{aligned}$$

And

$$\int y_0\,e^{3h}\>dh\;=\;y_0\,{1\over3}\,e^{3h},$$

$$\int q_1 h\,e^{3h}\>dh\;=\;q_1 {1\over9} e^{3h}(3h-1),$$

$$\int q_2 h^2 e^{3h}\>dh\;=\;q_2 {1\over27} e^{3h}(9h^2-6h+2),$$

$$\int q_3 h^3 e^{3h}\>dh\;=\;q_3 {1\over81} e^{3h}(27h^3-27h^2+18h-6).$$

Hence,

$$\begin{aligned} \int_{x=a}^b x^2\,y(\log x)\>dx\;=&\;t_i^3\,y_0\,{1\over 3}\Big({b^3\over t_i^3}\,-\,{a^3\over t_i^3}\Big)\\\ &+\;t_i^3\,q_1\,{1\over 9}\Big({b^3\over t_i^3}(3\log b-3\log t_i-1)\>-\>{a^3\over t_i^3}(3\log a-3\log t_i-1)\Big)\\\ &+\;t_i^3\,q_2\,{1\over 27}\Big({b^3\over t_i^3}\big(9(\log b-\log t_i)^2-6(\log b-\log t_i)+2\big)\>-\\\ &\qquad\qquad\qquad{a^3\over t_i^3}\big(9(\log a-\log t_i)^2-6(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i^3\,q_3\,{1\over 81}\Big({b^3\over t_i^3}\big(27(\log b-\log t_i)^3-27(\log b-\log t_i)^2+18(\log b-\log t_i)-6\big)\>-\\\ &\qquad\qquad\qquad{a^3\over t_i^3}\big(27(\log a-\log t_i)^3-27(\log a-\log t_i)^2+18(\log a-\log t_i)-6\big)\Big). \end{aligned}$$

In the previous two integrals of the product of x and x2 with the function splined in terms of log x, the x and x2 should really be (B − x) and (B − x)2, where B is the end of the relevant interval of the splined transmitter current waveform (because it is a convolution that is being performed):


$$I_1\;=\;\int_{x=a}^b\big(B-x\big)\,y(\log x)\>dx,$$

and


$$I_2\;=\;\int_{x=a}^b\big(B-x\big)^2\,y(\log x)\>dx.$$

Also, it was not really x and x2 in those integrals, because these terms come from the cubic splining of the transmitter current waveform: in each interval between discretization points it should be (x − A) and (x − A)2 that are involved, where A is the start of the relevant interval of the transmitter current waveform. Because

$$\begin{aligned} &\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx\;=\; -\,A\,\int_{x=a}^by(\log x)\>dx\;+\;\int_{x=a}^bx\,y(\log x)\>dx,\quad\hbox{and}\\\ &\int_{x=a}^b\big(x-A\big)^2\>y(\log x)\>dx\;=\; A^2\,\int_{x=a}^by(\log x)\>dx\;-\;2A\,\int_{x=a}^bx\,y(\log x)\>dx\;+\;\int_{x=a}^bx^2\,y(\log x)\>dx \end{aligned}$$

and

$$\begin{aligned} &\int_{x=a}^b\big(B-x\big)\>y(\log x)\>dx\;=\; B\,\int_{x=a}^by(\log x)\>dx\;-\;\int_{x=a}^bx\,y(\log x)\>dx,\quad\hbox{and}\\\ &\int_{x=a}^b\big(B-x\big)^2\>y(\log x)\>dx\;=\; B^2\,\int_{x=a}^by(\log x)\>dx\;-\;2B\,\int_{x=a}^bx\,y(\log x)\>dx\;+\;\int_{x=a}^bx^2\,y(\log x)\>dx \end{aligned}$$

then

$$\begin{aligned} &I_1\;=\;\big(B-A\big)\,\int_{x=a}^by(\log x)\>dx\;-\;\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx, \qquad\hbox{and}\\\ &I_2\;=\;\big(B-A\big)^2\int_{x=a}^by(\log x)\>dx\;-\; 2\big(B-A)\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx\;+\; \int_{x=a}^b\big(x-A\big)^2\,y(\log x)\>dx. \end{aligned}$$
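The log-time splining that underlies all of these integrals can be sketched with scipy (RSPLN and RSPLE are the program's own spline routines; this is just an equivalent illustration with a stand-in decay curve):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Logarithmically spaced times, as produced by the digital transform
t_knots = np.logspace(-5, -2, 20)
y_knots = 1.0 / np.sqrt(t_knots)          # stand-in decay curve

# Spline y as a function of log(t), mirroring what RSPLN does
spline = CubicSpline(np.log(t_knots), y_knots)

# Evaluate between the knots, mirroring RSPLE
t_eval = 3.3e-4
y_eval = spline(np.log(t_eval))

print(y_eval, 1.0 / np.sqrt(t_eval))  # close agreement between the knots
```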

Computing Sensitivities

The inverse problem of determining the conductivity and/or susceptibility of the Earth from electromagnetic measurements is nonlinear. Program EM1DTM uses an iterative procedure to solve this problem. At each iteration the linearized approximation of the full nonlinear problem is solved. This requires the Jacobian matrix for the sensitivities, J = (Jσ, Jκ) where:

$$\begin{aligned} \begin{align} J_{ij}^\sigma &= \frac{\partial d_i}{\partial \log \sigma_j} \\\ J_{ij}^\kappa &= \frac{\partial d_i}{\partial \kappa_j} \end{align} \end{aligned}$$

in which di is the ith observation, and σj and κj are the conductivity and susceptibility of the jth layer.

The algorithm for computing the sensitivities is obtained by differentiating the expressions for the H-fields with respect to the model parameters (Farquharson2003). For example, the sensitivity with respect to mj (either the conductivity or susceptibility of the jth layer) of the z-component of the H-field for a z-directed magnetic dipole source is given by:

$$\frac{\partial H_z}{\partial m_j} (x,y,z,\omega) = \frac{1}{4\pi} \int_0^\infty \frac{\partial}{\partial m_j} \Bigg [ \frac{P_{21}}{P_{11}} \Bigg ] e^{\lambda (z-h)} \lambda^2 J_0(\lambda r) d\lambda$$

The derivative of the coefficient is simply:

$$\frac{\partial}{\partial m_j} \Bigg [ \frac{P_{21}}{P_{11}} \Bigg ] = \frac{\partial P_{21}}{\partial m_j} \frac{1}{P_{11}} - \frac{\partial P_{11}}{\partial m_j} \frac{P_{21}}{P_{11}^2}$$

where P11 and P21 are elements of the propagation matrix P. The derivative of P with respect to mj (for 1 ≤ j ≤ M − 1) is

$$\frac{\partial \mathbf{P}}{\partial m_j} = \mathbf{M_1 M_2 ... M_{j-1}} \Bigg ( \frac{\partial \mathbf{M_j}}{\partial m_j} \mathbf{M_{j+1}} + \mathbf{M_j} \frac{\partial \mathbf{M_{j+1}}}{\partial m_j} \Bigg ) \mathbf{M_{j+2} ... M_M}$$

The sensitivities with respect to the conductivity and susceptibility of the basement halfspace are given by

$$\frac{\partial \mathbf{P}}{\partial m_M} = \mathbf{M_1 M_2 ... M_{M-1}} \frac{\partial \mathbf{M_M}}{\partial m_M}$$

The derivatives of the individual layer matrices with respect to the conductivities and susceptibilities are straightforward to derive, and are not given here.

Just as for the forward modelling, the Hankel transform in eq. Sensitivity_z, and those in the corresponding expressions for the sensitivities of the other observations, are computed using the digital filtering routine of Anderson (Anderson1982).

The partial propagation matrices

$$\mathbf{P_k} = \mathbf{M_1} \prod_{j=2}^k \mathbf{M_j}, \;\;\; k=2,...,M$$

are computed during the forward modelling, and saved for re-use during the sensitivity computations. This sensitivity-equation approach therefore has the efficiency of an adjoint-equation approach.
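As a sketch of this bookkeeping, the product rule for ∂**P**/∂mj can be written with explicit partial products. The helper below is illustrative only (our own names, small NumPy layer matrices), assuming that only **M**j and **M**j+1 depend on mj, as in the expressions above; it is not EM1DTM's implementation.

```python
import numpy as np

def dP_dmj(M_list, dM_list, j):
    """Derivative of P = M_1 M_2 ... M_M with respect to the parameter of
    layer j (0-based).  Only M_j and M_{j+1} are assumed to depend on that
    parameter; dM_list[k] holds dM_k/dm_j for k = j and j+1 (zeros elsewhere)."""
    M = len(M_list)
    # left partial product M_1 ... M_{j-1} (identity when j == 0);
    # in EM1DTM these partial products are cached from the forward modelling
    left = np.eye(2)
    for k in range(j):
        left = left @ M_list[k]
    if j < M - 1:
        # product rule over the two matrices that depend on m_j
        middle = dM_list[j] @ M_list[j + 1] + M_list[j] @ dM_list[j + 1]
        right = np.eye(2)
        for k in range(j + 2, M):
            right = right @ M_list[k]
        return left @ middle @ right
    # basement halfspace: only M_M depends on m_M
    return left @ dM_list[j]
```

A finite-difference check on a toy three-matrix product confirms the product-rule structure.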

Inversion Methodologies

In program EM1DTM, there are four different inversion algorithms. They all have the same general formulation <theory_inversion_gen>, but differ in their treatment of the trade-off parameter (see fixed trade-off <theory_inversion_fixed>, discrepancy principle <theory_inversion_disc>, GCV <theory_inversion_gcv> and L-curve criterion <theory_inversion_lcurve>).

General formulation

The aim of each inversion algorithm is to construct the simplest model that adequately reproduces the observations. This is achieved by posing the inverse problem as an optimization problem in which we recover the model that minimizes the objective function:


Φ = ϕd + βϕm

The two components of this objective function are as follows. ϕd is the data misfit:


ϕd = Md(Wd(d − dobs))

where dobs is the vector containing the N observations, d is the forward-modelled data and Md(x) is some measure of the length of a vector x. It is assumed that the noise in the observations is Gaussian and uncorrelated, and that the estimated standard deviation of the noise in the ith observation is of the form s0si, where si indicates the amount of noise in the ith observation relative to that in the others, and s0 is a scale factor that specifies the total amount of noise in the set of observations. The matrix Wd is therefore given by:


Wd = diag{1/(s0s1), ..., 1/(s0sN)}

The model-structure component of the objective function is ϕm. In its most general form it contains four terms:

$$\begin{aligned} \begin{split} \phi_m =& \; \alpha_s^\sigma M_s \left( \mathbf{W_s^\sigma} \big ( \mathbf{m^\sigma - m_s^{\sigma , ref}} \big ) \right) + \alpha_z^\sigma M_z \left( \mathbf{W_z^\sigma} \big ( \mathbf{m^\sigma - m_z^{\sigma , ref}} \big ) \right) \\\ &+ \alpha_s^\kappa M_s \left( \mathbf{W_s^\kappa} \big ( \mathbf{m^\kappa - m_s^{\kappa , ref}} \big ) \right) + \alpha_z^\kappa M_z \left( \mathbf{W_z^\kappa} \big ( \mathbf{m^\kappa - m_z^{\kappa , ref}} \big ) \right) \end{split} \end{aligned}$$

where mσ is the vector containing the logarithms of the layer conductivities. The matrix Wsσ is:

$$\mathbf{W_s^\sigma} = \textrm{diag} \big \{ \sqrt{t_1}, ..., \sqrt{t_{M-1}}, \sqrt{t_{M-1}} \big \}$$

where tj is the thickness of the jth layer. And the matrix Wzσ is:

$$\begin{aligned} \mathbf{W_z^\sigma} = \begin{bmatrix} -\sqrt{\frac{2}{t_1 + t_2}} & \sqrt{\frac{2}{t_1 + t_2}} & & & & \\\ & -\sqrt{\frac{2}{t_2 + t_3}} & \sqrt{\frac{2}{t_2 + t_3}} & & & \\\ & & \ddots & & & \\\ & & & -\sqrt{\frac{2}{t_{M-2} + t_{M-1}}} & \sqrt{\frac{2}{t_{M-2} + t_{M-1}}} & \\\ & & & & -\sqrt{\frac{2}{t_{M-1}}} & \sqrt{\frac{2}{t_{M-1}}} \\\ & & & & & 0 \end{bmatrix} \end{aligned}$$

The rows of any of these two weighting matrices can be scaled if desired (see file for additional model-norm weights<supportingFiles_weight>). The vectors msσ,ref and mzσ,ref contain the layer conductivities for the two possible reference models. The two terms in ϕm therefore correspond to the “smallest” and “flattest” terms for the conductivity parts of the model. The relative importance of the two terms is governed by the coefficients αsσ and αzσ, which are discussed in the Fundamentals of Inversion <http://giftoolscookbook.readthedocs.io/en/latest/content/fundamentals/Alphas.html#the-alphas-parameters>. The trade-off parameter β <http://giftoolscookbook.readthedocs.io/en/latest/content/fundamentals/Beta.html#the-beta-parameter-trade-off> balances the opposing effects of minimizing the misfit and minimizing the amount of structure in the model. It is the different ways in which β is determined that distinguish the four inversion algorithms in program EM1DTM from one another. They are described in the next sections.
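As a concrete sketch, the helper below (a hypothetical name, not part of EM1DTM) assembles the two conductivity weighting matrices from the M − 1 finite layer thicknesses, assuming the basement entries reuse the last finite thickness as in the matrices above.

```python
import numpy as np

def smallest_flattest_weights(t):
    """Build W_s (smallest) and W_z (flattest) for M layers from the M-1
    finite thicknesses t; the basement halfspace reuses t_{M-1}."""
    t = np.asarray(t, dtype=float)
    M = len(t) + 1
    # W_s: diagonal of sqrt(layer thickness), basement weighted by t_{M-1}
    Ws = np.diag(np.sqrt(np.append(t, t[-1])))
    # W_z: first differences scaled by sqrt(2 / (t_j + t_{j+1}));
    # the last difference uses 2/t_{M-1}, and the final row is zero
    Wz = np.zeros((M, M))
    for j in range(M - 2):
        w = np.sqrt(2.0 / (t[j] + t[j + 1]))
        Wz[j, j], Wz[j, j + 1] = -w, w
    w = np.sqrt(2.0 / t[-1])
    Wz[M - 2, M - 2], Wz[M - 2, M - 1] = -w, w
    return Ws, Wz
```

Note that each nonzero row of Wz sums to zero, so a homogeneous model contributes nothing to the flattest term.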

General Norm Measures

Program EM1DTM uses general measures of the “length” of a vector instead of the traditional sum-of-squares measure. (For more on the use of general measures in nonlinear inverse problems, see Farquharson & Oldenburg, 1998.) Specifically, the measure used for the data misfit, Md, is Huber's M-measure:

$$M_d(\mathbf{x}) = \sum_{i=1}^N \rho_H(x_i)$$

where

$$\begin{aligned} \rho_H(x) = \left\{\begin{array}{lr} x^2 & |x| \leq c,\\\ 2c|x| - c^2 & |x| > c \end{array} \right. \end{aligned}$$

This measure is a composite measure, behaving like a quadratic (i.e., sum-of-squares) measure for elements of the vector smaller than the parameter c, and behaving like a linear (i.e., l1-norm) measure for elements of the vector larger than c. If c is chosen to be large relative to the elements of the vector, Md will give similar results to those for the sum-of-squares measure. For smaller values of c, for example, when c is only a factor of 2 or so greater than the average size of an element of the vector, Md will act as a robust measure of misfit, and hence be less biased by any outliers or other non-Gaussian noise in the observations.
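A direct translation of the Huber measure might look like the following sketch (hypothetical helper name; the linear branch uses |x| so the contribution stays positive for negative residuals):

```python
import numpy as np

def huber_misfit(x, c):
    """Huber M-measure: quadratic for |x_i| <= c, linear (2c|x_i| - c^2)
    beyond, so outliers contribute linearly rather than quadratically."""
    x = np.asarray(x, dtype=float)
    ax = np.abs(x)
    return float(np.sum(np.where(ax <= c, x**2, 2.0 * c * ax - c**2)))
```

For c much larger than every |xi| this reduces exactly to the sum of squares.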

The measure used for the components of the measure of model structure, Ms and Mz, is the lp-like measure of Ekblom:

$$M_m(\mathbf{x}) = \sum_{j=1}^M \rho_E(x_j)$$

where


ρE(x) = (x2+ϵ2)p/2

The parameter p can be used to vary the behaviour of this measure. For example, with p = 2, this measure behaves like the sum-of-squares measure, and a model constructed using this will have the smeared-out, fuzzy appearance that is characteristic of the sum-of-squares measure. In contrast, for p = 1, this measure does not bias against large jumps in conductivity from one layer to the next, and will result in a piecewise-constant, blocky model. The parameter ϵ is a small number, considerably smaller than the average size of an element of the vector. Its use is to avoid the numerical difficulties for zero-valued elements when p < 2 from which the true lp-norm suffers. In program EM1DTM, the values of p can be different for the smallest and flattest components of ϕm .
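Ekblom's measure translates just as directly (again a hypothetical helper, not EM1DTM's code):

```python
import numpy as np

def ekblom_measure(x, p, eps):
    """Ekblom's l_p-like measure: sum over (x_j^2 + eps^2)^(p/2).
    p = 2 behaves like a sum of squares; p = 1 tolerates blocky jumps;
    eps avoids the singularity at zero-valued elements when p < 2."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x**2 + eps**2) ** (p / 2.0)))
```

With p = 2 and eps = 0 this is the ordinary sum of squares; with p = 1 it approaches the l1-norm as eps shrinks.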

General Algorithm

As mentioned in the computing sensitivities <theory_sensitivities> section, the inverse problem considered here is nonlinear. It is solved using an iterative procedure. At the nth iteration, the actual objective function being minimized is:


Φn = ϕdn + βnϕmn

In the data misfit ϕdn, the forward-modelled data dn are the data for the model that is sought at the current iteration. These data are approximated by:


dn = dn − 1 + Jn − 1δm

where δm = mn − mn − 1 and Jn − 1 is the Jacobian matrix given by Sensitivity and evaluated for the model from the previous iteration. At the nth iteration, the problem to be solved is that of finding the change, (δmσ, δmκ), to the model which minimizes the objective function Φn. Differentiating eq. Objective_Fcn with respect to the components of δmσ and δmκ, and equating the resulting expressions to zero, gives the system of equations to be solved. This is slightly more involved now that ϕd and ϕm comprise Huber's M-measure and Ekblom's lp-like measure, rather than the usual sum-of-squares measures. Specifically, the derivative of HuberMeasure gives:

$$\frac{\partial M_d}{\partial \delta m_k} (\mathbf{x}) = \sum_{i=1}^N \rho^\prime_H (x_i) \frac{\partial x_i}{\partial \delta m_k}$$

where x = Wd(dn − 1 + Jn − 1δm − dobs). The vector of these derivatives with respect to the perturbations of the model parameters in all the layers can be written in matrix notation.

The linear system of equations to be solved for δm is therefore:

$$\begin{aligned} \begin{split} & \bigg [ \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d} \mathbf{J}^{n-1} + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \bigg ] \delta \mathbf{m} = \\\ & \mathbf{J}^{n-1 \, T} \mathbf{W_d}^{T} \mathbf{W_d} \big ( \mathbf{d^{obs}} - \mathbf{d}^{n-1} \big ) + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref} - \mathbf{m}^{n-1}} \big ) + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{split} \end{aligned}$$

where T denotes the transpose and:

$$\begin{aligned} \begin{split} \mathbf{J}^{n-1} &= \big ( \mathbf{J}^{\sigma , n-1} \mathbf{J}^{\kappa , n-1} \big ) \\\ \mathbf{W_1} &= \begin{bmatrix} \sqrt{\alpha_s^\sigma} \mathbf{W}_s^\sigma & 0 \\ 0 & \sqrt{\alpha_s^\kappa} \mathbf{W}_s^\kappa \end{bmatrix} \\\ \mathbf{W_2} &= \begin{bmatrix} \sqrt{\alpha_z^\sigma} \mathbf{W}_z^\sigma & 0 \\ 0 & \sqrt{\alpha_z^\kappa} \mathbf{W}_z^\kappa \end{bmatrix} \\\ \mathbf{m_1^{ref}} &= \big ( \mathbf{m}_s^{\sigma , ref \, T} \mathbf{m}_s^{\kappa , ref \, T} \big )^T \\\ \mathbf{m_2^{ref}} &= \big ( \mathbf{m}_z^{\sigma , ref \, T} \mathbf{m}_z^{\kappa , ref \, T} \big )^T \\\ \mathbf{\hat{X}}^{n-1} &= \big ( 0 \, (\mathbf{X}^{n-1})^{-1} \big ) \end{split} \end{aligned}$$

where Xn − 1 = diag{m1κ, n − 1, ..., mMκ, n − 1}. The solution to eq. Systemdm is equivalent to the least-squares solution of:

$$\begin{aligned} \begin{bmatrix} \mathbf{W_d J}^{n-1} \\ \sqrt{\beta^n} \mathbf{W_1} \\ \sqrt{\beta^n} \mathbf{W_2} \\ \sqrt{\gamma^n/2} \, \mathbf{\hat{X}}^{n-1} \end{bmatrix} \delta \mathbf{m} = \begin{bmatrix} \mathbf{W_d } ( \mathbf{d^{obs} - d}^{n-1} ) \\ \sqrt{\beta^n} \mathbf{W_1} ( \mathbf{m_1^{ref} - m}^{n-1} ) \\ \sqrt{\beta^n} \mathbf{W_2}( \mathbf{m_2^{ref} - m}^{n-1} ) \\ \sqrt{\gamma^n/2} \, \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{bmatrix} \end{aligned}$$

Once the step δm has been determined by the solution of eq. Systemdm or eq. SystemdmLSQ, the new model is given by:


mn = mn − 1 + νδm

There are two conditions on the step length ν. First, if positivity of the layer susceptibilities is being enforced:


νδκj >  − κjn − 1

must hold for all j = 1, ..., M. Secondly, the objective function must be decreased by the addition of the step to the model:


ϕdn + βnϕmn − γnϕLBn < ϕdn − 1 + βnϕmn − 1 − γnϕLBn − 1

where ϕdn is now the misfit computed using the full forward modelling for the new model mn. To determine mn, a step length (ν) of either 1 or the maximum value for which eq. cond1 is true (whichever is greater) is tried. If eq. cond2 is true for the step length, it is accepted. If eq. cond2 is not true, ν is decreased by factors of 2 until it is true.
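The step-acceptance logic above can be sketched as a simple backtracking loop. This is an illustrative sketch with our own names (positivity handling omitted; `phi` stands for the full nonlinear objective evaluated by forward modelling):

```python
def line_search(m_old, dm, phi, nu0=1.0, max_halvings=20):
    """Backtracking step: start from nu0 and halve nu until the fully
    re-computed objective decreases.  Returns the new model and the
    accepted step length (old model and None if no decrease is found)."""
    phi_old = phi(m_old)
    nu = nu0
    for _ in range(max_halvings):
        m_new = [a + nu * b for a, b in zip(m_old, dm)]
        if phi(m_new) < phi_old:     # condition (cond2) of the text
            return m_new, nu
        nu *= 0.5                    # decrease nu by factors of 2
    return m_old, None
```

When positivity of the susceptibilities is enforced, nu0 would instead be the larger of 1 and the largest step length satisfying the positivity condition.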

Algorithm 1: fixed trade-off parameter

The trade-off parameter, β, remains fixed at its user-supplied value throughout the inversion. The least-squares solution of eq. SystemdmLSQ is used. This is computed using the subroutine “LSQR” of Paige & Saunders (Paige1982). If the desired value of β is known, this is the fastest of the four inversion algorithms as it does not involve a line search over trial values of β at each iteration. If the appropriate value of β is not known, it can be found using this algorithm by trial-and-error. This may or may not be time-consuming.

Algorithm 2: discrepancy principle

If a complete description of the noise in a set of observations is available - that is, both s0 and si (i = 1, ..., N) are known - the expectation of the misfit, E(ϕd), is equal to the number of observations N. Algorithm 2 therefore attempts to choose the trade-off parameter so that the misfit for the final model is equal to a target value of chifac × N. If the noise in the observations is well known, chifac should equal 1. However, chifac can be adjusted by the user to give a target misfit appropriate for a particular data-set. If a misfit as small as the target value cannot be achieved, the algorithm searches for the smallest possible misfit.

Experience has shown that choosing the trade-off parameter at early iterations in this way can lead to excessive structure in the model, and that removing this structure once the target (or minimum) misfit has been attained can require a significant number of additional iterations. A restriction is therefore placed on the greatest-allowed decrease in the misfit at any iteration, thus allowing structure to be slowly but steadily introduced into the model. In program EM1DTM, the target misfit at the nth iteration is given by:


ϕdn, tar = max(mfac × ϕdn − 1, chifac × N)

where the user-supplied factor mfac is such that 0.1 ≤ mfac ≤ 0.5.
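The target-misfit schedule is a one-liner; the sketch below uses hypothetical parameter names matching the text:

```python
def target_misfit(phi_prev, N, chifac=1.0, mfac=0.5):
    """Target misfit at iteration n: never below chifac*N, and never more
    than a factor-mfac reduction of the previous misfit, so structure is
    introduced into the model gradually."""
    assert 0.1 <= mfac <= 0.5          # allowed range for mfac
    return max(mfac * phi_prev, chifac * N)
```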

The step δm is found from the solution of eq. SystemdmLSQ using subroutine LSQR of Paige & Saunders (Paige1982). The line search at each iteration moves along the ϕd versus log β curve until either the target misfit, ϕdn, tar, is bracketed, in which case a bisection search is used to converge to the target, or the minimum misfit ( > ϕdn − 1) is bracketed, in which case a golden section search (for example, Press et al., 1986) is used to converge to the minimum. The starting value of β for each line search is βn − 1. For the first iteration, the β ( = β0) for the line search is given by N/ϕm(m), where m contains typical values of conductivity and/or susceptibility. Specifically, m is a model whose top M/5 layers have a conductivity of 0.02 S/m and susceptibility of 0.02 SI units, and whose remaining layers have a conductivity of 0.01 S/m and susceptibility of 0 SI units. Also, the reference models used in the computation of ϕm(m) are homogeneous halfspaces of 0.01 S/m and 0 SI units. The line search is efficient, but does involve the full forward modelling to compute the misfit for each trial value of β.

Algorithm 3: GCV criterion

If only the relative amount of noise in the observations is known - that is, si (i = 1, ..., N) is known but not s0 - the appropriate target value for the misfit cannot be determined, and hence Algorithm 2 is not the most suitable. The generalized cross-validation (GCV) method provides a means of estimating, during the course of an inversion, a value of the trade-off parameter that results in an appropriate fit to the observations, and in so doing, effectively estimating the level of noise, s0, in the observations (see, for example, Wahba1990; Hansen1998).

The GCV method is based on the following argument (Wahba1990; Haber1997; Haber2000). Consider inverting all but the first observation using a trial value of β, and then computing the individual misfit between the first observation and the first forward-modelled datum for the model produced by the inversion. This can be repeated leaving out all the other observations in turn, inverting the retained observations using the same value of β, and computing the misfit between the observation left out and the corresponding forward-modelled datum. The best value of β can then be defined as the one which gives the smallest sum of all the individual misfits. For a linear problem, this corresponds to minimizing the GCV function. For a nonlinear problem, the GCV method can be applied to the linearized problem being solved at each iteration (Haber1997; Haber2000; Li2003; Farquharson2000). From eq. Systemdm, the GCV function for the nth iteration is given by:

$$GCV (\beta ) = \dfrac{\big \| \mathbf{W_d \hat{d} - W_d J}^{n-1} \mathbf{M}^{-1} \big ( \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d \hat{d} + r} \big ) \big \|^2 }{\big [ \textrm{trace} \big ( \mathbf{I - W_d J}^{n-1} \mathbf{M}^{-1} \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \big ) \big ]^2}$$

where

$$\begin{aligned} \begin{split} \mathbf{M} (\beta) &= \bigg [ \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d} \mathbf{J}^{n-1} + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \bigg ] \\\ \mathbf{r} &= \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref} - \mathbf{m}^{n-1}} \big ) + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{split} \end{aligned}$$

and d̂ = dobs − dn − 1. If β* is the value of the trade-off parameter that minimizes eq. GCV at the nth iteration, the actual value of β used to compute the new model is given by:


βn = max(β*, bfac × βn − 1)

where the user-supplied factor bfac is such that 0.01 < bfac < 0.5. As for Algorithm 2, this limit on the allowed decrease in the trade-off parameter prevents unnecessary structure being introduced into the model at early iterations. The inverse of the matrix M required in eq. GCV, and the solution to eq. Systemdm given this inverse, is computed using the Cholesky factorization routines from LAPACK (Anderson1999). The line search at each iteration moves along the curve of the GCV function versus the logarithm of the trade-off parameter until the minimum is bracketed (or bfac × βn − 1 reached), and then a golden section search (e.g., Press et al., 1986) is used to converge to the minimum. The starting value of β in the line search is βn − 1 ( β0 is estimated in the same way as for Algorithm 2). This is an efficient search, even with the inversion of the matrix M.

Algorithm 4: L-curve criterion

As for the GCV-based method <theory_inversion_gcv>, the L-curve method provides a means of estimating an appropriate value of the trade-off parameter if only si, i = 1, ..., N, are known and not s0. For a linear inverse problem, if the data misfit ϕd is plotted against the model norm ϕm for all reasonable values of the trade-off parameter β, the resulting curve tends to have a characteristic "L"-shape, especially when plotted on logarithmic axes (see, for example, Hansen1998). The corner of this L-curve corresponds to roughly equal emphasis on the misfit and model norm during the inversion. Moving along the L-curve away from the corner is associated with a progressively smaller decrease in the misfit for large increases in the model norm, or a progressively smaller decrease in the model norm for large increases in the misfit. The value of β at the point of maximum curvature on the L-curve is therefore the most appropriate, according to this criterion.

For a nonlinear problem, the L-curve criterion can be applied to the linearized inverse problem at each iteration (Li & Oldenburg, 1999; Farquharson & Oldenburg, 2000). In this situation, the L-curve is defined using the linearized misfit, which uses the approximation given in eq. DataPerturb for the forward-modelled data. The curvature of the L-curve is computed using the formula (Hansen, 1998):

$$C(\beta) = \frac{\zeta^\prime \eta^{\prime \prime } - \zeta^{\prime\prime} \eta^\prime}{\big ( (\zeta^\prime)^2 + (\eta^\prime)^2 \big )^{3/2}}$$

where ζ = log ϕdlin and η = log ϕm. The prime denotes differentiation with respect to log β. As for both Algorithms 2 <theory_inversion_disc> & 3 <theory_inversion_gcv>, a restriction is imposed on how quickly the trade-off parameter can be decreased from one iteration to the next. The actual value of β chosen for use at the nth iteration is given by eq. betachoice, where β* now corresponds to the value of β at the point of maximum curvature on the L-curve.

Experience has shown that the L-curve for the inverse problem considered here does not always have a sharp, distinct corner. The associated slow variation of the curvature with β can make the numerical differentiation required to evaluate eq. zetaeq prone to numerical noise. The line search along the L-curve used in program EM1DTM to find the point of maximum curvature is therefore designed to be robust (rather than efficient). The L-curve is sampled at equally-spaced values of log β, and long differences are used in the evaluation of eq. zetaeq to introduce some smoothing. A parabola is fit through the point from the equally-spaced sampling with the maximum value of curvature and its two nearest neighbours. The value of β at the maximum of this parabola is taken as β*. In addition, it is sometimes found that, for the range of values of β that are tried, the maximum value of the curvature of the L-curve on logarithmic axes is negative. In this case, the curvature of the L-curve on linear axes is investigated to find a maximum. As for Algorithms 1 & 2, the least-squares solution to eq. SystemdmLSQ is used, and is computed using subroutine LSQR of Paige & Saunders (Paige1982).
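A robust numerical version of the curvature evaluation might look like the sketch below, with `np.gradient` central differences standing in for the long differences described above (helper name is ours):

```python
import numpy as np

def lcurve_curvature(log_beta, log_phid, log_phim):
    """Curvature C(beta) of the (log phi_d, log phi_m) curve, with zeta
    and eta differentiated numerically with respect to log beta using
    central differences on an equally-spaced sampling."""
    z1 = np.gradient(log_phid, log_beta)   # zeta'
    e1 = np.gradient(log_phim, log_beta)   # eta'
    z2 = np.gradient(z1, log_beta)         # zeta''
    e2 = np.gradient(e1, log_beta)         # eta''
    return (z1 * e2 - z2 * e1) / (z1**2 + e1**2) ** 1.5
```

On a synthetic unit circle (ζ = cos θ, η = sin θ) the formula returns a curvature of 1 everywhere, which makes a convenient sanity check.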

Relative weighting within the model norm

The four coefficients in the model norm (see eq. MOF) are ultimately the responsibility of the user. Larger values of αsσ relative to αzσ result in constructed conductivity models that are closer to the supplied reference model. Smaller values of αsσ and αzσ result in flatter conductivity models. Likewise for the coefficients related to susceptibilities.

If both conductivity and susceptibility are active in the inversion, the relative size of αsσ & αzσ to αsκ & αzκ is also required. Program EM1DTM includes a simple means of calculating a default value for this relative balance. Using the layer thicknesses, weighting matrices Wsσ, Wzσ, Wsκ & Wzκ, and user-supplied weighting of the smallest and flattest parts of the conductivity and susceptibility components of the model norm (see acs, acz, ass & asz in the input file description<inputEM1DTM>, line 5), the following two quantities are computed for a test model m*:

$$\begin{aligned} \begin{split} \phi_m^\sigma &= acs \big \| \mathbf{W_s^\sigma} \big ( \mathbf{m}^* - \mathbf{m}_s^{\sigma, ref} \big ) \big \|^2 + acz \big \| \mathbf{W_z^\sigma} \big ( \mathbf{m}^* - \mathbf{m}_z^{\sigma, ref} \big ) \big \|^2 \\\ \phi_m^\kappa &= ass \big \| \mathbf{W_s^\kappa} \big ( \mathbf{m}^* - \mathbf{m}_s^{\kappa, ref} \big ) \big \|^2 + asz \big \| \mathbf{W_z^\kappa} \big ( \mathbf{m}^* - \mathbf{m}_z^{\kappa, ref} \big ) \big \|^2 \end{split} \end{aligned}$$

The conductivity and susceptibility of the top M/5 layers in the test model are 0.02 S/m and 0.02 SI units respectively, and the conductivity and susceptibility of the remaining layers are 0.01 S/m and 0 SI units. The coefficients of the model norm used in the inversion are then αsσ = acs, αzσ = acz, αsκ = As × ass & αzκ = As × asz, where As = ϕmσ/ϕmκ. It has been found that a balance between the conductivity and susceptibility portions of the model norm computed in this way is adequate as an initial guess. However, the balance usually requires modification by the user to obtain the best susceptibility model. (The conductivity model tends to be insensitive to this balance.) If anything, the default balance will suppress the constructed susceptibility model.

Positive susceptibility

Program EM1DTM can perform an unconstrained inversion for susceptibilities (along with the conductivities) as well as invert for values of susceptibility that are constrained to be positive. Following Li & Oldenburg (Li2003), the positivity constraint is implemented by incorporating a logarithmic barrier term in the objective function (see eqs. ObjectiveFun & barrier_cond). For the initial iteration, the coefficient of the logarithmic barrier term is chosen so that this term is of equal importance to the rest of the objective function:

$$\gamma^0 = \frac{\phi_d^0 + \beta^0 \phi_m^0}{- \phi^0_{LB}}$$

At subsequent iterations, the coefficient is reduced according to the formula:


γn = (1 − min(νn − 1, 0.925))γn − 1

where νn − 1 is the step length used at the previous iteration. As mentioned at the end of the general formulation <theory_inversion_gen>, when positivity is being enforced, the step length at any particular iteration must satisfy eq. cond1.
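The barrier-coefficient schedule can be sketched as follows (hypothetical helper; it assumes ϕ0LB is negative, as it is for susceptibilities below 1 SI, so that γ0 comes out positive):

```python
def barrier_schedule(phi_d0, beta0, phi_m0, phi_lb0, nus):
    """gamma^0 balances the barrier against phi_d + beta*phi_m; each
    subsequent iteration shrinks gamma by (1 - min(nu, 0.925)) using the
    previous step length nu.  Returns the list [gamma^0, gamma^1, ...]."""
    gammas = [(phi_d0 + beta0 * phi_m0) / (-phi_lb0)]
    for nu in nus:
        gammas.append((1.0 - min(nu, 0.925)) * gammas[-1])
    return gammas
```

The cap at 0.925 guarantees the coefficient never drops by more than a factor of 1/0.075 per iteration, even for full steps.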

Convergence criteria

To determine when an inversion algorithm has converged, the following criteria are used (Gill1981):

$$\begin{aligned} \begin{split} \Phi^{n-1} - \Phi^n &< \tau (1 + \Phi^n )\\\ \| \mathbf{m}^{n-1} - \mathbf{m}^n \| &< \sqrt{\tau} (1 + \| \mathbf{m}^n \| ) \end{split} \end{aligned}$$

where τ is a user-specified parameter. The algorithm is considered to have converged when both of the above equations are satisfied. The default value of τ is 0.01.
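These two tests translate directly into code (hypothetical helper name):

```python
import math

def converged(Phi_prev, Phi, m_prev, m, tau=0.01):
    """Convergence test: both the decrease in the objective and the
    change in the model must be small relative to current magnitudes."""
    dm = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_prev, m)))
    nm = math.sqrt(sum(v * v for v in m))
    cond1 = (Phi_prev - Phi) < tau * (1.0 + Phi)
    cond2 = dm < math.sqrt(tau) * (1.0 + nm)
    return cond1 and cond2
```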

In case the algorithm happens directly upon the minimum, an additional condition is tested:


gn∥ ≤ ϵ

where ϵ is a small number close to zero, and where the gradient, gn, at the nth iteration is given by:

$$\mathbf{g}^n = -2 \mathbf{J}^{n \, T} \mathbf{W_d}^T \mathbf{W_d} ( \mathbf{d^{obs}} - \mathbf{d}^n ) - 2 \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref}} - \mathbf{m}^{n} \big ) - \gamma^n \mathbf{\hat{X}}^{n \, T} \mathbf{\hat{X}}^{n} \mathbf{m}^{n}$$